(without alignments)

Title:          US-10-624-932-2
Perfect score:  4791
Sequence:       1 MAVRPGLWPALLGIVLAAWL......AVAGLGQPDAGLFTVSEAEC 898

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5

Searched:       1586107 seqs, 282547505 residues

Total number of hits satisfying chosen parameters:  1586107

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 

Database :      A_Geneseq_29Jan04:*
                1:  geneseqp1980s:*
                2:  geneseqp1990s:*
                3:  geneseqp2000s:*
                4:  geneseqp2001s:*
                5:  geneseqp2002s:*
                6:  geneseqp2003as:*
                7:  geneseqp2003bs:*
                8:  geneseqp2004s:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
Result              %
   No.   Score  Match  Length  DB  ID        Description
     1    4791  100.0     898   5  AAU85403  Aau85403 Human pro
     2    4781   99.8     898   5  AAU97899  Aau97899 Human net
     3  4698.5   98.1     899   5  AAU79939  Aau79939 Human UNC
     4    4638   96.8     898   2  AAW78898  Aaw78898 Rat UNC-5
     5    4638   96.8     898   5  AAU10543  Aau10543 Rat netri
     6    4638   96.8     898   5  AAU97900  Aau97900 Rat netri
     7  4526.5   94.5     943   4  AAM79128  Aam79128 Human pro
     8    4413   92.1     842   5  AAU74818  Aau74818 Human REP
     9    2815   58.8     556   2  AAW78899  Aaw78899 Human UNC
    10    2755   57.5     931   4  AAB50691  Aab50691 Human UNC
    11    2755   57.5     931   7  ADE63098  Ade63098 Human Pra
    12    2755   57.5     982   4  ABG11551  Abg11551 Novel hum
    13  2578.5   53.8     945   7  ADE63096  Ade63096 Rat Prote
    14  2571.5   53.7     943   2  AAW78900  Aaw78900 Rat UNC-5
    15  2563.5   53.5     933   5  AAO18734  Aao18734 Human NOV
    16  2563.5   53.5     933   5  AAO18735  Aao18735 Human NOV
    17  2558.5   53.4     945   4  AAU12244  Aau12244 Human PRO
    18  2558.5   53.4     945   6  ABO17688  Abo17688 Novel hum
    19  2558.5   53.4     945   6  ABU80942  Abu80942 Human PRO
    20  2558.5   53.4     945   6  ABU66642  Abu66642 Human PRO
    21  2558.5   53.4     945   6  ABU59723  Abu59723 Novel sec
    22  2558.5   53.4     945   6  ABO24913  Abo24913 Human sec
    23  2558.5   53.4     945   6  ABU66918  Abu66918 Human sec
    24  2558.5   53.4     945   6  ADA45665  Ada45665 Novel hum
    25  2558.5   53.4     945   6  ADA76096  Ada76096 Human PRO
    26  2558.5   53.4     945   6  ADA18746  Ada18746 Human PRO
    27  2558.5   53.4     945   6  ADA61369  Ada61369 Homo sapi
    28  2558.5   53.4     945   6  ADB19154  Adb19154 hum
    29  2558.5   53.4     945   6  ADB27695  Adb27695
    30  2558.5   53.4     945   6  ADA86174  hum
    31  2558.5   53.4     945   6  ADB15738
    32  2558.5   53.4     945   6  ADA47524  PRO
    33  2558.5   53.4     945   6  ADA67319  Ada67319 PRO
    34  2558.5   53.4     945   6  ADB30326  PRO
    35  2558.5   53.4     945   6  ADA85622  Ada85622 Novel hum
    36  2558.5   53.4     945   6  ADA96834  Ada96834 Human PRO
    37  2558.5   53.4     945   6  ADA79138  Ada79138 Human PRO
    38  2558.5   53.4     945   6  ADA87277  Ada87277 Novel hum
    39  2558.5   53.4     945   6  ADB16479  Adb16479 PRO
    40  2558.5   53.4     945   6  ADA91571  Ada91571 Novel hum
    41  2558.5   53.4     945   6  ADB14634  Adb14634 Human PRO
    42  2558.5   53.4     945   6  ADB18595  Adb18595 Novel hum
    43  2558.5   53.4     945   6  ADA93810  Ada93810 Human PRO
    44  2558.5   53.4     945   6  ADB19706  Adb19706 Novel hum
    45    2558   53.4     945   6  ADB13018  Adb13018 Human PRO
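
The % Match figures in the summaries above appear to be each hit's score expressed as a percentage of the perfect score (4791), rounded to one decimal place. A minimal Python sketch of that relationship follows; the function name `query_match_percent` is illustrative only and is not part of the search tool.

```python
PERFECT_SCORE = 4791  # "Perfect score" reported in the run header


def query_match_percent(score, perfect=PERFECT_SCORE):
    """Return the score as a percentage of the perfect score, one decimal."""
    return round(100.0 * score / perfect, 1)


# Scores copied from results 2, 3, 9 and 17 of the summary list.
for score in (4781, 4698.5, 2815, 2558.5):
    print(score, query_match_percent(score))
```

The printed percentages can be compared against the % Match column for the same result numbers.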



ALIGNMENTS 



RESULT 1 
AAU85403 

ID AAU85403 standard; protein; 898 AA. 
XX 

AC AAU85403; 
XX 

DT 21-MAY-2002 (first entry) 
XX 

DE Human protein NOV1.
XX 

KW Human; NOVX; cardiomyopathy; atherosclerosis; diabetes; 

KW cell signal processing disorder; metabolic disorder; obesity; infection; 

KW anorexia; cancer-associated cachexia; cancer; neurodegenerative disorder; 

KW Alzheimer's disease; Parkinson's disease; immune disorder; 

KW haematopoietic disorders; dyslipidaemia; pain; asthma; hypertension;

KW osteoporosis; Crohn's disease; multiple sclerosis; angina pectoris; 



KW myocardial infarction; ulcer; allergy; benign prostatic hypertrophy; 

KW psychosis; neurological disorder; anxiety; schizophrenia;

KW manic depression; dementia; dyskinesia; Huntington's disease; 

KW Gilles de la Tourette's syndrome; gene therapy. 

XX 

OS Homo sapiens. 
XX 

PN WO200210216-A2. 
XX 

PD 07-FEB-2002. 
XX 

PF 30-JUL-2001; 2001WO-US024225.
XX

PR 28-JUL-2000; 2000US-0221409P.
PR 04-AUG-2000; 2000US-0222840P.
PR 04-AUG-2000; 2000US-0223752P.
PR 04-AUG-2000; 2000US-0223762P.
PR 04-AUG-2000; 2000US-0223769P.
PR 04-AUG-2000; 2000US-0223770P.
PR 14-AUG-2000; 2000US-0225146P.
PR 15-AUG-2000; 2000US-0225392P.
PR 15-AUG-2000; 2000US-0225470P.
PR 16-AUG-2000; 2000US-0225697P.
PR 01-FEB-2001; 2001US-0263662P.
PR 05-APR-2001; 2001US-0281645P.
XX 

PA (CURA-) CURAGEN CORP. 
XX 

PI Padigaru M, Mezes P, Mishra V, Burgess C, Casman S, Grosse WM; 

PI Alsobrook JP, Lepley DM, Gerlach VL, Macdougall JR, Smithson G; 
XX 

DR WPI; 2002-180074/23. 

DR N-PSDB; ABK37922. 
XX 

PT New isolated cytoplasmic, nuclear, membrane bound, or secreted 

PT polypeptide, useful for treating cardiomyopathy, atherosclerosis, 

PT infections, cancer, neurodegenerative, metabolic, hematopoietic and 

PT immune disorders. 
XX 

PS Claim 1; Page 11; 213pp; English. 
XX 

CC The invention relates to an isolated cytoplasmic, nuclear, membrane 

CC bound, or secreted polypeptide (NOVX, x= 1-14) their variants or mature 

CC form. Also included are the nucleic acids encoding the NOVX proteins, a 

CC vector comprising the nucleic acid, a cell comprising the vector, an anti 

CC -NOVX antibody and modulators of NOVX. NOVX, the nucleic acid and the 

CC antibody are useful for treating or preventing a NOVX-associated 

CC disorder, where the disorder is selected from cardiomyopathy, 

CC atherosclerosis, diabetes, a disorder related to cell signal processing 

CC and metabolic pathway modulation, metabolic disorders, obesity, 

CC infectious disease, anorexia, cancer-associated cachexia, cancer, 

CC neurodegenerative disorders, Alzheimer's disease, Parkinson's disease, 

CC immune disorders, haematopoietic disorders, and the various 

CC dyslipidaemias, metabolic disturbances associated with obesity, the 

CC metabolic syndrome X and wasting disorders associated with chronic 

CC diseases, bacterial, fungal, protozoal and viral infections, pain, 

CC bulimia, asthma, hypertension, urinary retention, osteoporosis, Crohn's 



CC disease, multiple sclerosis, Albright Hereditary Osteodystrophy, angina 

CC pectoris, myocardial infarction, ulcer, allergy, benign prostatic 

CC hypertrophy, and psychotic and neurological disorders, including anxiety, 

CC schizophrenia, manic depression, delirium, dementia, and dyskinesias, 

CC such as Huntington's disease and Gilles de la Tourette's syndrome. The

CC nucleic acid is useful in gene therapy. The present sequence represents a 

CC NOVX protein 
XX 

SQ Sequence 898 AA; 

Query Match 100.0%; Score 4791; DB 5; Length 898; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 898; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy   1 MAVRPGLWPALLGIVLAAWLRGSGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKP 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MAVRPGLWPALLGIVLAAWLRGSGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKP 60

Qy  61 VLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLE 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 VLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLE 120

Qy 121 EYWCQCVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAE 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 EYWCQCVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAE 180

Qy 181 VEWLRNEDLVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSASAAVIVY 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 VEWLRNEDLVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSASAAVIVY 240

Qy 241 VNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGS 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 VNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGS 300

Qy 301 WSPWSKWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHSASGPEDVA 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 WSPWSKWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHSASGPEDVA 360

Qy 361 LYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLL 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 LYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLL 420

Qy 421 TIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSSPTSEAEEFVS 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 TIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSSPTSEAEEFVS 480

Qy 481 RLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHK 540
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 481 RLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHK 540

Qy 541 PEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSW 600
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 541 PEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSW 600

Qy 601 EDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVACT 660
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 601 EDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVACT 660

Qy 661 SLEYNIRVYCLHDTHDALKEWQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLW 720
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 661 SLEYNIRVYCLHDTHDALKEWQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLW 720

Qy 721 KSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFSINF 780
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 721 KSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFSINF 780

Qy 781 NITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLAQKL 840
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 781 NITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLAQKL 840

Qy 841 HLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAEC 898
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 841 HLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAEC 898



RESULT 2
AAU97899

ID AAU97899 standard; protein; 898 AA.
XX
AC AAU97899;
XX
DT 27-AUG-2002 (first entry)
XX
DE Human netrin binding membrane receptor UNC5H-1 protein.
XX
KW Netrin binding membrane receptor; receptor; UNC5H-1; human; nootropic-
KW neuroprotective; cytostatic; antiparkinsonian; cerebroprotective; cancer;
KW central nervous system; CNS; stroke; Parkinson's disease;
KW multiple sclerosis; Alzheimer's disease.
XX
OS Homo sapiens.
XX
FH Key             Location/Qualifiers
FT Domain          152..223
FT                 /note= "Immunoglobulin domain"
FT Domain          247..294
FT                 /note= "Thrombospondine type 1 domain"
FT Domain          302..348
FT                 /note= "Thrombospondine type 1 domain"
FT Region          361..382
FT                 /note= "Transmembrane region"
FT Domain          495..598
FT                 /note= "ZU5 domain"
FT Domain          817..897
FT                 /note= "Death domain"
XX
PN WO200233080-A2.
XX
PD 25-APR-2002.
XX
PF 15-OCT-2001; 2001WO-EP011891.
XX
PR 16-OCT-2000; 2000US-0240061P.
XX


PA (FARB ) BAYER AG. 
XX 

PI Koehler RH; 
XX 

DR WPI; 2002-463314/49. 

DR N-PSDB; ABK52891. 
XX 

PT Novel human netrin binding membrane receptor polypeptide and 

PT polynucleotides for identifying modulating agents useful in treating 

PT diseases e.g. Parkinson's disease, multiple sclerosis, stroke, 

PT Alzheimer's disease. 

XX 

PS Claim 1; Fig 2; 94pp; English. 
XX 

CC This invention relates to the DNA and protein sequences of a novel 

CC purified human netrin binding membrane receptor, UNC5H-1. The DNA 

CC sequence of the invention is useful as a probe for detecting a nucleic 

CC acid encoding the UNC5H-1 protein in a biological sample. The sequences 

CC of the invention are useful to screen for agents which decrease the 

CC activity of the UNC5H-1 protein. The sequences are also useful for 

CC screening agents which regulate (modulate) the activity of the protein of 

CC the invention. A pharmaceutical composition containing the protein of the 

CC invention or a reagent that modulates the activity of the UNC5H-1 protein 

CC may be useful for treating a UNC5H-1 dysfunction related disease such as 

CC cancer or a central nervous system (CNS) disorders (e.g, Parkinson's 

CC disease, multiple sclerosis, stroke and Alzheimer's disease). Fusion 

CC proteins comprising the UNC5H-1 protein are useful for generating 

CC antibodies and for in various assay systems, and the protein can be used 

CC as a bait protein in a two-hybrid assay or three-hybrid assay. The method 

CC of the invention is useful for detecting a coding sequence for the UNC5H- 

CC 1 protein. The present sequence represents the human netrin binding 

CC membrane receptor UNC5H-1 protein of the invention 

XX 

SQ Sequence 898 AA; 

Query Match 99.8%; Score 4781; DB 5; Length 898; 

Best Local Similarity 99.8%; Pred. No. 0; 

Matches 896; Conservative 1; Mismatches 1; Indels 0; Gaps 0 
Qy   1 MAVRPGLWPALLGIVLAAWLRGSGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKP 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MAVRPGLWPALLGIVLAAWLRGSGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKP 60

Qy  61 VLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLE 120
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 VLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLE 120

Qy 121 EYWCQCVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAE 180
       ||||||||||||||||||||||||| ||||||||||||||||||||||||||||||||||
Db 121 EYWCQCVAWSSSGTTKSQKAYIRIAYLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAE 180

Qy 181 VEWLRNEDLVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSASAAVIVY 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 VEWLRNEDLVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSASAAVIVY 240

Qy 241 VNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGS 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 VNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGS 300

Qy 301 WSPWSKWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHSASGPEDVA 360
       |||||||||||||||||||||||||||||||||||||||||||||||||||:||||||||
Db 301 WSPWSKWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHTASGPEDVA 360

Qy 361 LYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLL 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 LYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLL 420

Qy 421 TIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSSPTSEAEEFVS 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 TIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSSPTSEAEEFVS 480

Qy 481 RLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHK 540
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 481 RLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHK 540

Qy 541 PEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSW 600
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 541 PEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSW 600

Qy 601 EDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVACT 660
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 601 EDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVACT 660

Qy 661 SLEYNIRVYCLHDTHDALKEWQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLW 720
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 661 SLEYNIRVYCLHDTHDALKEWQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLW 720

Qy 721 KSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFSINF 780
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 721 KSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFSINF 780

Qy 781 NITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLAQKL 840
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 781 NITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLAQKL 840

Qy 841 HLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAEC 898
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 841 HLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAEC 898
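
The score of 4781 for this result can be cross-checked against the perfect score of 4791. The sketch below is a hedged illustration, not the search tool's own code: it assumes the two differing positions are query R against database Y (row 121-180) and query S against database T (row 301-360), as read from the alignment rows above, and it hard-codes only the four standard BLOSUM62 entries needed. The helper names are hypothetical.

```python
# BLOSUM62 entries needed for this example only (the matrix is symmetric).
BLOSUM62 = {
    ("R", "R"): 5, ("R", "Y"): -2,
    ("S", "S"): 4, ("S", "T"): 1,
}


def pair_score(a, b):
    """Look up a BLOSUM62 score regardless of argument order."""
    if (a, b) in BLOSUM62:
        return BLOSUM62[(a, b)]
    return BLOSUM62[(b, a)]


def score_after_substitutions(perfect, substitutions):
    """Subtract, for each (query, database) pair, the score lost
    relative to the query residue matching itself."""
    total = perfect
    for q, d in substitutions:
        total -= pair_score(q, q) - pair_score(q, d)
    return total


print(score_after_substitutions(4791, [("R", "Y"), ("S", "T")]))  # 4781
```

The R/Y replacement costs 7 and the S/T replacement costs 3, which together account for the 10-point difference between the two scores.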



RESULT 3 
AAU79939 

ID AAU79939 standard; protein; 899 AA. 
XX 

AC AAU79939; 
XX 

DT 15-JUL-2002 (first entry) 
XX 

DE Human UNC5-like protein NOV1.
XX 

KW Human; NOVX polypeptide; cardiomyopathy; atherosclerosis; cancer; 

KW cell signal processing; metabolic pathway modulation; cancerous tissue; 

KW antibody; diabetes; transgenic animal; UNC5-like protein; NOV1; 

KW chromosome 13. 



XX 

OS Homo sapiens.
XX

PN WO200229038-A2.
XX

PD 11-APR-2002.
XX

PF 04-OCT-2001; 2001WO-US031377.
XX

PR 04-OCT-2000; 2000US-0237862P.
XX 

PA (CURA-) CURAGEN CORP. 
XX 

PI Herrmann JL, Rastelli L, Shimkets RA; 
XX 

DR WPI; 2002-340104/37. 

DR N-PSDB; ABK49422. 
XX 

PT Novel isolated NOVX polypeptide, and encoded polynucleotide, useful for 

PT treating cardiomyopathy, atherosclerosis, and cancer.

XX 

PS Claim 1; Page 9; 180pp; English. 
XX 

CC The present invention relates to a new NOVX polypeptide having a 900 

CC (NOV1), 4349 (NOV2), 940 (NOV3), 798 (NOV4), 865 (NOV5), or 331 (NOV6)

CC residue amino acid sequence, as given in the specification. The novel 

CC polypeptide, and its encoding polynucleotide, are used to treat 

CC cardiomyopathy, atherosclerosis, cancer or a disease related to cell 

CC signal processing and metabolic pathway modulation, in a human. Detecting 

CC the polypeptide or polynucleotide is useful for identifying cancerous 

CC tissue. The antibody can be used to treat diabetes or cancer. The host 

CC cells can be used to produce non-human transgenic animals useful in drug 

CC screening. The present amino acid sequence is that of the human UNC5-like 

CC protein NOV1 of the invention. This sequence is encoded by the human UNC5 

CC -like NOV1 gene located on chromosome 13 

XX 

SQ Sequence 899 AA; 

Query Match 98.1%; Score 4698.5; DB 5; Length 899; 
Best Local Similarity 98.7%; Pred. No. 0; 

Matches 888; Conservative 2; Mismatches 7; Indels 3; Gaps 3 

Qy   1 MAVRPGLWPALLGIVLAAWLRGSGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKP 60
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MAVRPGLWPALLGIVLAAWLRGSGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKP 60

Qy  61 VLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLE 120
       |||||||||||||||||||||||||||||||||||||| |||||||||||||||||||||
Db  61 VLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGEPTMEVRINVSRQQVEKVFGLE 120

Qy 121 EYWCQCVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAE 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 EYWCQCVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAE 180

Qy 181 VEWLRNEDLVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSASAAVIVY 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 VEWLRNEDLVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSASAAVIVY 240

Qy 241 VNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNV-QKTACATLCPVDG 299
       ||||||||||||||||||||||||||||||||||||||||||||||  :|  : |  |||
Db 241 VNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVHDRTVSSLLVSVDG 300

Qy 300 SWSPWSKWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHSASGPEDV 359
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 SWSPWSKWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHSASGPEDV 360

Qy 360 ALYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHL 419
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 ALYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHL 420

Qy 420 LTIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSSPTSEAEEFV 479
       |||||||| |||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 LTIQPDLS-TTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSSPTSEAEEFV 479

Qy 480 SRLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLH 539
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 480 SRLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLH 539

Qy 540 KPEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGS 599
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 540 KPEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGS 599

Qy 600 WE-DVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVA 658
       || |||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 600 WEQDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVA 659

Qy 659 CTSLEYNIRVYCLHDTHDALKEWQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSS 718
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 660 CTSLEYNIRVYCLHDTHDALKEWQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSS 719

Qy 719 LWKSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFSI 778
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 720 LWKSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFSI 779

Qy 779 NFNITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLAQ 838
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 780 NFNITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLAQ 839

Qy 839 KLHLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAEC 898
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 840 KLHLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAEC 899
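
The score of 4698.5 for this alignment can be reproduced from the run parameters (BLOSUM62, Gapop 10.0, Gapext 0.5). The following is a sketch under stated assumptions, not the tool's implementation: the substituted pairs are read off the alignment rows above, the BLOSUM62 values are the standard matrix entries, and the gap cost is assumed to be the opening penalty plus one extension penalty per gapped position. All function and variable names are illustrative.

```python
GAP_OPEN = 10.0  # "Gapop" from the run header
GAP_EXT = 0.5    # "Gapext" from the run header

# (query residue, database residue, BLOSUM62 score) for each aligned pair
# that is not an identity, as read from the alignment rows.
SUBSTITUTIONS = [
    ("L", "E", -3), ("Q", "D", 0), ("K", "R", 2),
    ("A", "V", 0), ("C", "S", -1), ("A", "S", 1),
    ("T", "L", -1), ("C", "V", -1), ("P", "S", -1),
]

# BLOSUM62 self-scores of the query residues involved.
SELF = {"L": 4, "Q": 5, "K": 5, "A": 4, "C": 9, "T": 5, "P": 7}


def gap_cost(length):
    """Assumed affine form: opening penalty plus extension per position."""
    return GAP_OPEN + GAP_EXT * length


def reconstructed_score(perfect, substitutions, gap_lengths, unaligned_query):
    """Start from the perfect score and subtract every loss."""
    total = perfect
    for q, _d, s in substitutions:
        total -= SELF[q] - s                       # substitution loss
    total -= sum(gap_cost(n) for n in gap_lengths)  # gap penalties
    total -= sum(SELF[r] for r in unaligned_query)  # query residues facing a gap
    return total


# Three single-position gaps; one of them leaves query residue T unaligned.
print(reconstructed_score(4791, SUBSTITUTIONS, [1, 1, 1], ["T"]))  # 4698.5
```

Under these assumptions the substitutions cost 56, the three gaps cost 31.5 and the unaligned T costs 5, which matches the reported score.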



RESULT 4 
AAW78898 
ID AAW78898 standard; protein; 898 AA.
XX
AC AAW78898;
XX
DT 25-MAR-2003 (revised)
DT 21-DEC-1998 (first entry)
XX
DE Rat UNC-5 homologue UNC5H-1.
XX



KW UNC-5; UNC5H-1; rat; netrin receptor; cell migration; axon guidance; 

KW diagnosis; therapy. 

XX 

OS Rattus sp. 
XX 

FH Key Location/Qualifiers 

FT Peptide 580. .594 

FT /note= "peptide used to raise rabbit polyclonal antisera" 
XX 

PN WO9837085-A1. 
XX 

PD 27-AUG-1998. 
XX 

PF 19-FEB-1998; 98WO-US003143.
XX

PR 19-FEB-1997; 97US-00808982.
XX 

PA (REGC ) UNIV CALIFORNIA. 
XX 

PI Tessier-Lavigne M, Leonardo ED, Hinck L, Masu M, Keinomasu K; 
XX 

DR WPI; 1998-495364/42. 

DR N-PSDB; AAV52940. 
XX 

PT Netrin-binding, vertebrate proteins - useful for diagnosis, therapy and 

PT the biopharmaceutical industry. 

XX 

PS Claim 1; Page 19-22; 32pp; English. 
XX 

CC UNC5H-1 and UNC5H-2 (see AAW78900) are rat homologues of Caenorhabditis 

CC elegans UNC-5 protein. Their amino acid sequences were deduced from 

CC isolated unc5h cDNA clones (see AAV52940 and AAV52942) isolated from an 

CC E18 brain cDNA library. The predicted proteins show similarity with UNC- 

CC 5, possess 2 predicted Ig-like domains and 2 predicted thrombospondin 

CC type-1 repeats, a predicted membrane spanning region, and a large 

CC intracellular domain. They are predicted to be involved in cell migration 

CC and axon guidance, and are characterised as receptor proteins for 

CC netrins. Human UNC5H-1 (see AAW78899) and UNC5H-2 (see AAW78901) proteins 

CC are also claimed. Vertebrate UNC-5 proteins may be produced recombinantly 

CC from transfected host cells. The invention also provides unc-5 

CC hybridisation probes and primers, vertebrate UNC-5-specific binding

CC agents such as specific antibodies, and methods of making and using the 

CC subject compositions in diagnosis (e.g. genetic hybridisation screens for 

CC vertebrate unc-5 transcripts), therapy (e.g. gene therapy to modulate 

CC vertebrate unc-5 gene expression) and in the biopharmaceutical industry 

CC (e.g. as immunogens, reagents for modulating cell guidance, reagents for

CC screening chemical libraries for lead pharmacological agents, etc.). 

CC (Updated on 25-MAR-2003 to correct PI field.) 

XX 

SQ Sequence 898 AA; 

Query Match 96.8%; Score 4638; DB 2; Length 898; 
Best Local Similarity 96.0%; Pred. No. 0; 

Matches 862; Conservative 17; Mismatches 19; Indels 0; Gaps 0; 

Qy   1 MAVRPGLWPALLGIVLAAWLRGSGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKP 60
       ||||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MAVRPGLWPVLLGIVLAAWLRGSGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKP 60

Qy  61 VLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLE 120
       |||||||||||||||||||||||||||||||||| |||||||||||||||||||||||||
Db  61 VLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDSSSGLPTMEVRINVSRQQVEKVFGLE 120

KSIVLPCRPPEGIPPAE 
M I I I I I I I I I I I I I I 
yRKNFEQEPLAKEVSLEpGIVLPCRPPEGIPPAE 

VEWLRNEDLVDP S LDPNVYI TREHS LWRQARLADTANlTCVJ 
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 



EYttCQCVAWs\$£GTT^^ 

I I I I I I I I I I I I I I I I I I I I I I I l/l l) I \ \\ I I I I I I I 11 I I 

EYWCQCVAWS S S GTTKSQKAYI RIlAY 



VEWLRNEDLVDP S LDPNVYI TREH S LWRQARLADTAN^ TCVi 




:nivarr 

I I I I I I 



IVAR XRS 



sAAVIX/Y* 

II I I I I I I 

dAAV^VY 



120 



120 



180 



180 



240 



240 



vnggwsItwtewsvcsascgrgwqkrsrsctnpaplnggafcegqnvqktacatlcp VDGS 300 



1 1 1 



1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 



1 1 1 1 



VNGGWSJL'WTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCP VDGS 300 



II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I : I 2 I I Mill 

WSSWSKWSACGLDCTHWRSRECSDPAPRNGGEECRGADLDTRNCTSDLCLHTASCPEDVA 360 

|A?5uLLLVLILVYCRKKEGLDSDVAX)SSILTSGFQPVSIKPSKADNPHLL 42 0 
I I : I I : I I I I I I 1)1 I I I I : I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

LFllLLLALGLIYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLL 42 0 



TIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSSPTSEAEEFVS 
I I I I I I I I I I I I I I I I II I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I II I : I I I 
TIQPDLSTTTTTYQGS^CSRQDGPSPKFQLSNGHLLSPLGSGRHTLHHSSPTSEAEDFVS 



GTBNM 



480 



480 



RL&TQNYFRsjLP^ 54 0 

II I II I I II ill I II III I if If I I II I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

RLSTQNYFR^LPRGTSNliAY^TFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHK 54 0 



I I I I I I I I I I II I I I I I : I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 



EDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAA7VKRLKLLLFAPVACT 660 
I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I 



I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I 



I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I : : I I I I I I I I : I I I I I I I I I I I I : I I I 



NITKDTRFAELLALE 

I I I I I I I I I I I I I I I I 

NITKDTRFAELLALES 




EAGV^ALVGPSAFKIPFLIRQKI 
I l/l I I I I I I I 1 I I I i I I I I I I 
GGVPALVGPSAFKIPFLIRQKI 



:^LDPPdkRGADWRTLAQKL 840 

: \ I I I I IM I I I I I I I I I I I 
.:ASLDPPclsRGADWRTLAQKL 84 0 



1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 
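
The "Best Local Similarity" values reported for results 2 to 4 are consistent with the number of identical positions divided by the total number of alignment columns (matches, conservative replacements, mismatches and indels). A short Python sketch of that reading follows; the function name is hypothetical and the formula is inferred from the reported figures rather than taken from the tool's documentation.

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Percentage of alignment columns that are exact matches."""
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


# Counts copied from the statistics lines of results 2, 3 and 4.
print(best_local_similarity(896, 1, 1, 0))
print(best_local_similarity(888, 2, 7, 3))
print(best_local_similarity(862, 17, 19, 0))
```

The three printed values can be compared with the 99.8%, 98.7% and 96.0% figures given in the corresponding result records.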



RESULT 5 
AAU10543 

ID AAU10543 standard; protein; 898 AA. 
XX 

AC AAU10543; 
XX 

DT 14-FEB-2002 (first entry) 
XX 

DE Rat netrin receptor UNC5H1 (YSG7) polypeptide. 
XX 

KW YSG; YSG7; schizophrenia; chronic animal model; LCGU; netrin receptor; 

KW local cerebral glucose utilisation; phosphodiesterase 1-alpha; UNC5H1; 

KW calcium-independent alpha-latrotoxin receptor; CIRL; trkE; synapsin 1A; 

KW epithelial discoidin domain receptor 1; synapsin 1B; neuroleptic; 

KW tumour necrosis factor alpha; TNF-alpha; rat. 
XX 

OS Rattus sp. 
XX 

PN WO200175440-A2. 
XX 

PD 11-OCT-2001. 
XX 

PF 02-APR-2001; 2001WO-GB001486.
XX

PR 31-MAR-2000; 2000GB-00007880.

PR 26-MAY-2000; 2000GB-00012768.
XX 

PA (WELF-) WELFIDE CORP. 
XX 

PI Cochran S, Paterson G, Ohashi Y, Morris B, Pratt J; 
XX 

DR WPI; 2002-010813/01. 

DR N-PSDB; AAS16843. 
XX 

PT Novel chronic animal model of schizophrenia, useful for identifying anti- 

PT psychotic drugs and genes that are associated with schizophrenia. 

XX 

PS Disclosure; Fig 8b; 79pp; English. 
XX 

CC The invention relates to YSG polynucleotide fragments for use in 

CC diagnosing and/or developing treatments for schizophrenia using chronic 

CC animal models. The polynucleotides and their encoded polypeptides are 

CC used for identification of compounds which modulate the expression of YSG 

CC molecules, leading to the manufacture of schizophrenia medicaments. The 

CC sequences can also be used for testing candidate compounds for any effect 

CC on the polypeptides. Anti-schizophrenic effects of a compound can be 

CC determined by measuring local cerebral glucose utilisation (LCGU) or 

CC comparing its expression level with that of a control group. The 

CC sequences are useful in the identification of genes associated with 

CC schizophrenic states and in the development of an antibody. The sequences 

CC of the invention include phosphodiesterase 1-alpha, calcium-independent 

CC alpha-latrotoxin receptors (CIRL) -1, 2&3, epithelial discoidin domain 

CC receptor 1 (trkE), netrin receptor (UNC5H1), synapsins 1A and 1B and 

CC tumour necrosis factor (TNF) alpha. This sequence represents rat netrin 

CC receptor UNC5H1 (YSG7) polypeptide 



XX 

SQ Sequence 898 AA; 



Query Match 96.8%; Score 4638; DB 5; Length 898; 

Best Local Similarity 96.0%; Pred. No. 0; 

Matches 862; Conservative 17; Mismatches 19; Indels 0; Gaps 0; 

Qy 1 MAVRPGLWPALLGIVLAAWLRGSGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKP 60

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 1 MAVRPGLWPVLLGIVLAAWLRGSGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKP 60

Qy 61 VLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLE 120

I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I II I I I I I I II I

Db 61 VLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDSSSGLPTMEVRINVSRQQVEKVFGLE 120

Qy 121 EYWCQCVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAE 180 

I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 121 EYWCQCVAWSSSGTTKSQKAYIRIAYLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAE 180 

Qy 181 VEWLRNEDLVDPSLDPNVYITREHSLWRQARLADTANYTCVAKNIVARRRSASAAVIVY 240

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 181 VEWLRNEDLVDPSLDPNVYITREHSLWRQARLADTANYTCVAKNIVARRRSTSAAVIVY 240

Qy 241 VNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGS 300

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 241 VNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGS 300

Qy 301 WSPWSKWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHSASGPEDVA 360 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II : I I I I I I I I I I I I I : I : I I I I I I I 

Db 301 WSSWSKWSACGLDCTHWRSRECSDPAPRNGGEECRGADLDTRNCTSDLCLHTASCPEDVA 360 

Qy 361 LYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLL 420

I I : I I : I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 361 LYIGLVAVAVCLFLLLLALGLIYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLL 420

Qy 421 TIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSSPTSEAEEFVS 480 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I : I I I 
Db 421 TIQPDLSTTTTTYQGSLCSRQDGPSPKFQLSNGHLLSPLGSGRHTLHHSSPTSEAEDFVS 480 

Qy 481 RLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHK 540 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 481 RLSTQNYFRSLPRGTSNMAYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHK 540 

Qy 541 PEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSW 600

I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 541 PEDVRLPLAGCQTLLSPVVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSW 600

Qy 601 EDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVACT 660 

I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I 
Db 601 EDVLHLGEESPSHLYYCQLEAGACYVFTEQLGRFALVGEALSVAATKRLRLLLFAPVACT 660 

Qy 661 SLEYNIRVYCLHDTHDALKEVVQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLW 720

I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I II I II I I
Db 661 SLEYNIRVYCLHDTHDALKEVVQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLW 720

Qy 721 KSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFSINF 780 

I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I :: I I I I I I I I : I I I I I I I I I I I I : I I I 



Db 721 KSKLLVSYQEIPFYHIWNGTQQYLHCTFTLERINASTSDLACKVWVWQVEGDGQSFNINF 780



Qy 781 NITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLAQKL 840 

I I I I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I 

Db 781 NITKDTRFAELLALESEGGVPALVGPSAFKIPFLIRQKIIASLDPPCSRGADWRTLAQKL 840 

Qy 841 HLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAEC 898

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 841 HLDSHLSFFASKPSPTAMILNLWEARHFPNGNLGQLAAAVAGLGQPDAGLFTVSEAEC 898
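
The summary figures printed with each hit appear to follow from simple ratios. The sketch below is an assumption inferred from the numbers in this listing, not documented behaviour of the search program: "Query Match" is taken as the hit score divided by the perfect score (4791 for this query), and "Best Local Similarity" as matches divided by the total number of alignment columns (matches + conservative + mismatches + indels). The function names are hypothetical.

```python
def query_match_pct(score, perfect_score):
    """Hit score as a percentage of the query's perfect (self-match) score."""
    return round(100.0 * score / perfect_score, 1)


def local_similarity_pct(matches, conservative, mismatches, indels):
    """Identical columns as a percentage of all alignment columns."""
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


# Figures reported for the AAU10543 hit (Score 4638, perfect score 4791).
print(query_match_pct(4638, 4791))            # 96.8
print(local_similarity_pct(862, 17, 19, 0))   # 96.0
```

Under the same assumption, the AAM79128 hit (Score 4526.5; Matches 863, Conservative 2, Mismatches 10, Indels 67) gives 94.5 and 91.6, which agrees with the values printed for that record.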



RESULT 6 
AAU97900 

ID AAU97900 standard; protein; 898 AA. 
XX 

AC AAU97900; 
XX 

DT 27-AUG-2002 (first entry) 
XX 

DE Rat netrin binding membrane receptor UNC5H-1 protein. 
XX 

KW Netrin binding membrane receptor; receptor; UNC5H-1; Rat; nootropic; 

KW neuroprotective; cytostatic; antiparkinsonian; cerebroprotective; cancer; 

KW central nervous system; CNS; stroke; Parkinson's disease; 



KW multiple sclerosis; Alzheimer's disease.
XX

OS Rattus sp.
XX

FH Key Location/Qualifiers

FT Domain 152..223

FT /note= "Immunoglobulin domain"

FT Domain 247..294

FT /note= "Thrombospondine type 1 domain"

FT Domain 302..348

FT /note= "Thrombospondine type 1 domain"

FT Region 361..382

FT /note= "Transmembrane region"

FT Domain 495..598

FT /note= "ZU5 domain"

FT Domain 817..897

FT /note= "Death domain"
XX

PN WO200233080-A2.
XX

PD 25-APR-2002.
XX

PF 15-OCT-2001; 2001WO-EP011891.
XX

PR 16-OCT-2000; 2000US-0240061P.





XX 

PA (FARB ) BAYER AG. 
XX 

PI Koehler RH; 
XX 

DR WPI; 2002-463314/49. 
XX 

PT Novel human netrin binding membrane receptor polypeptide and 



PT polynucleotides for identifying modulating agents useful in treating 

PT diseases e.g. Parkinson's disease, multiple sclerosis, stroke, 

PT Alzheimer's disease. 
XX 

PS Disclosure; Fig 3; 94pp; English. 
XX 

CC This invention relates to the DNA and protein sequences of a novel 

CC purified human netrin binding membrane receptor, UNC5H-1. The DNA 

CC sequence of the invention is useful as a probe for detecting a nucleic 

CC acid encoding the UNC5H-1 protein in a biological sample. The sequences 

CC of the invention are useful to screen for agents which decrease the 

CC activity of the UNC5H-1 protein. The sequences are also useful for 

CC screening agents which regulate (modulate) the activity of the protein of 

CC the invention. A pharmaceutical composition containing the protein of the 

CC invention or a reagent that modulates the activity of the UNC5H-1 protein 

CC may be useful for treating a UNC5H-1 dysfunction related disease such as 

CC cancer or a central nervous system (CNS) disorder (e.g. Parkinson's 

CC disease, multiple sclerosis, stroke and Alzheimer's disease). Fusion 

CC proteins comprising the UNC5H-1 protein are useful for generating 

CC antibodies and for use in various assay systems, and the protein can be used 

CC as a bait protein in a two-hybrid assay or three-hybrid assay. The method 

CC of the invention is useful for detecting a coding sequence for the UNC5H- 

CC 1 protein. The present sequence represents the Rat netrin binding 

CC membrane receptor UNC5H-1 protein of the invention 

XX 

SQ Sequence 898 AA; 

Query Match 96.8%; Score 4638; DB 5; Length 898; 

Best Local Similarity 96.0%; Pred. No. 0; 

Matches 862; Conservative 17; Mismatches 19; Indels 0; Gaps 0; 
Qy 1 MAVRPGLWPALLGIVLAAWLRGSGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKP 60

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 1 MAVRPGLWPVLLGIVLAAWLRGSGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKP 60

Qy 61 VLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLE 120

Db 61 VLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDSSSGLPTMEVRINVSRQQVEKVFGLE 120

Qy 121 EYWCQCVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAE 180

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 121 EYWCQCVAWSSSGTTKSQKAYIRIAYLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAE 180

Qy 181 VEWLRNEDLVDPSLDPNVYITREHSLWRQARLADTANYTCVAKNIVARRRSASAAVIVY 240

Db 181 VEWLRNEDLVDPSLDPNVYITREHSLWRQARLADTANYTCVAKNIVARRRSTSAAVIVY 240

Qy 241 VNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGS 300

Db 241 VNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGS 300

Qy 301 WSPWSKWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHSASGPEDVA 360

Db 301 WSSWSKWSACGLDCTHWRSRECSDPAPRNGGEECRGADLDTRNCTSDLCLHTASCPEDVA 360

Qy 361 LYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLL 420

I I : I I : I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 361 LYIGLVAVAVCLFLLLLALGLIYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLL 420



Qy 421 TIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSSPTSEAEEFVS 480 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I : I I I 

Db 421 TIQPDLSTTTTTYQGSLCSRQDGPSPKFQLSNGHLLSPLGSGRHTLHHSSPTSEAEDFVS 480 

Qy 481 RLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHK 540

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 481 RLSTQNYFRSLPRGTSNMAYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHK 540

Qy 541 PEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSW 600 

I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 541 PEDVRLPLAGCQTLLSPVVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSW 600

Qy 601 EDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVACT 660

I I I I I I I I I : I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I
Db 601 EDVLHLGEESPSHLYYCQLEAGACYVFTEQLGRFALVGEALSVAATKRLRLLLFAPVACT 660 

Qy 661 SLEYNIRVYCLHDTHDALKEVVQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLW 720

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 661 SLEYNIRVYCLHDTHDALKEVVQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLW 720

Qy 721 KSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFSINF 780 

I I I I I I I I I I I I I I I I I I I I I : I I I I I I II I I : : I I I I I I I I : I I I I I I I I I I I I : I I I 
Db 721 KSKLLVSYQEIPFYHIWNGTQQYLHCTFTLERINASTSDLACKVWVWQVEGDGQSFNINF 780 

Qy 781 NITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLAQKL 840 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I II I I I I I I 
Db 781 NITKDTRFAELLALESEGGVPALVGPSAFKIPFLIRQKIIASLDPPCSRGADWRTLAQKL 840 

Qy 841 HLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAEC 898

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 841 HLDSHLSFFASKPSPTAMILNLWEARHFPNGNLGQLAAAVAGLGQPDAGLFTVSEAEC 898
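
Each hit record above is a flat file whose lines begin with a two-letter code (ID, AC, DT, DE, KW, OS, PN and so on), with XX as a separator. The sketch below shows one way such a record could be grouped by line code. It assumes the line code is followed by a single space, as in this listing, and `parse_record` is a hypothetical helper, not part of any database toolkit.

```python
from collections import defaultdict


def parse_record(lines):
    """Group flat-file lines by their two-letter line code.

    Blank lines and 'XX' separators are skipped; continuation lines that
    repeat a code (for example several KW lines) are joined with a space.
    """
    fields = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or line == "XX":
            continue
        code, _, value = line.partition(" ")
        if len(code) == 2 and code.isupper():
            fields[code].append(value.strip())
    return {code: " ".join(values) for code, values in fields.items()}


record = parse_record([
    "ID AAU97900 standard; protein; 898 AA.",
    "XX",
    "AC AAU97900;",
    "XX",
    "DT 27-AUG-2002 (first entry)",
    "XX",
    "DE Rat netrin binding membrane receptor UNC5H-1 protein.",
    "XX",
    "KW Netrin binding membrane receptor; receptor; UNC5H-1; Rat; nootropic;",
    "KW neuroprotective; cytostatic; antiparkinsonian; cerebroprotective; cancer;",
    "XX",
    "OS Rattus sp.",
])
print(record["AC"])  # AAU97900;
print(record["OS"])  # Rattus sp.
```

Joining repeated codes keeps multi-line KW and CC text together, at the cost of losing the original line breaks.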



RESULT 7 


AAM79128

ID AAM79128 standard; protein; 943 AA.
XX

AC AAM79128;
XX

DT 06-NOV-2001 (first entry)
XX

DE Human protein SEQ ID NO 1790.
XX

KW Human; cytokine; cell proliferation; cell differentiation; gene therapy;

KW vaccine; peptide therapy; stem cell growth factor; haematopoiesis;

KW tissue growth factor; immunomodulatory; cancer; leukaemia;

KW nervous system disorder; arthritis; inflammation.
XX

OS Homo sapiens.
XX

PN WO200157190-A2.
XX

PD 09-AUG-2001.
XX

PF 05-FEB-2001; 2001WO-US004098.
XX

PR 03-FEB-2000; 2000US-00496914.

PR 27-APR-2000; 2000US-00560875.

PR 20-JUN-2000; 2000US-00598075.

PR 19-JUL-2000; 2000US-00620325.

PR 01-SEP-2000; 2000US-00654936.

PR 15-SEP-2000; 2000US-00663561.

PR 20-OCT-2000; 2000US-00693325.

PR 30-NOV-2000; 2000US-00728422.



XX 

PA (HYSE-) HYSEQ INC. 
XX 

PI Tang YT, Liu C, Drmanac RT, Asundi V, Zhou P, Xu C, Cao Y; 

PI Ma Y, Zhao QA, Wang D, Wang J, Zhang J, Ren F, Chen R, Wang ZW; 

PI Xue AJ, Yang Y, Wejhrman T, Goodrich R; 

XX 

DR WPI; 2001-476283/51. 

DR N-PSDB; AAK52261. 
XX 

PT Nucleic acids encoding polypeptides with cytokine-like activities, useful 

PT in diagnosis and gene therapy. 

XX 

PS Claim 20; Page 4148-4150; 6221pp; English. 
XX 

CC The invention relates to polynucleotides (AAK51456-AAK53435) and the 

CC encoded polypeptides (AAM78323-AAM80302) that exhibit activity relating to 

CC cytokine, cell proliferation or cell differentiation or which may induce 

CC production of other cytokines in other cell populations. The 

CC polynucleotides and polypeptides are useful in gene therapy, vaccines or 

CC peptide therapy. The polypeptides have various cytokine-like activities, 

CC e.g. stem cell growth factor activity, haematopoiesis regulating 

CC activity, tissue growth factor activity, immunomodulatory activity and 

CC activin/inhibin activity and may be useful in the diagnosis and/or 

CC treatment of cancer, leukaemia, nervous system disorders, arthritis and 

CC inflammation. Note: Records for SEQ ID NO 2110 (AAK52581), 2111 

CC (AAK52582) and 3666 (AAM80020) are omitted as the relevant pages from the 

CC sequence listing were missing at the time of publication 

XX 

SQ Sequence 943 AA; 

Query Match 94.5%; Score 4526.5; DB 4; Length 943; 

Best Local Similarity 91.6%; Pred. No. 0; 

Matches 863; Conservative 2; Mismatches 10; Indels 67; Gaps 4; 
Qy 1 MAVRPGLWPALLGIVLAAWLRGSGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKP 60

Db 25 MTRRPSL

Qy 61 VLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLE 120

Db 77 VLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLE 136

Qy 121 EYWCQCVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAE 180

Db 137 EYWCQCVAWSSSGTTKSQKAYIRIAYLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAE 196

Qy 181 VEWLRNEDLVDPSLDPNVYITREHSLWRQARLADTANYTCVAKNIVARRRSASAAVIVY 240

I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I II I I I I II I I I I I I I I I I I I I I

Db 197 VEWLRNEDLVDPSLDPNVYITREHSLWRQARLADTANYTCVAKNIVARRRSASAAVIVY 256



Qy 241 VNGGWSTWTEWSVCSASCGRGWQKRSRSCTN 271 

I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I 

Db 257 GGPRDSLVTGRGTAVPLGSDMWLSFSVRPVNGGWSTWTEWSVCSASCGRGWQKRSRSCTN 316 

Qy 272 PAPLNGGAFCEGQNVQKTACATLCPVDGSWSPWSKWSACGLDCTHWRSRECSDPAPRNGG 331 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 317 PAPLNGGAFCEGQNVQKTACATLCPVDGSWSPWSKWSACGLDCTHWRSRECSDPAPRNGG 376 

Qy 332 EECQGTDLDTRNCTSDLCVHSASGPEDVALYVGLIAVAVCLVLLL 376

I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I

Db 377 EECQGTDLDTRNCTSDLCVHNSYTPAPTKAMLSPAAASGPEDVALYVGLIAVAVCLVLLL 436

Qy 377 LVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLLTIQPDLSTTTTTYQGS 436 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 437 LVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLLTIQPDLSTTTTTYQGS 496 

Qy 437 LCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSSPTSEAEEFVSRLSTQNYFRSLPRGTS 496 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 497 LCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSSPTSEAEEFVSRLSTQNYFRSLPRGTS 556

Qy 497 NMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHKPEDVRLPLAGCQTLLS 556

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 557 NMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHKPED 603 

Qy 557 PIVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSWEDVLHLGEEAPSHLYY 616 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 604 --VSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSWEDVLHLGEEAPSHLYY 661

Qy 617 CQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVACTSLEYNIRVYCLHDTHD 676 

I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I 
Db 662 CQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVACTSLEYNIRVYCLHDTHD 721 

Qy 677 ALKEVVQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLWKSKLLVSYQEIPFYHI 736

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 722 ALKEVVQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLWKSKLLVSYQEIPFYHI 781

Qy 737 WNGTQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFSINFNITKDTRFAELLALES 796

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 782 WNGTQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFSINFNITKDTRFAELLALES 841 

Qy 797 EAGVPALVGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLAQKLHLDSHLSFFASKPSPT 856

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 842 EAGVPALVGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLAQKLHLDSHLSFFASKPSPT 901 

Qy 857 AMILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAEC 898

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 902 AMILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAEC 943



RESULT 8 
AAU74818 

ID AAU74818 standard; protein; 842 AA. 
XX 

AC AAU74818; 



XX 

DT 23-APR-2002 (first entry) 
XX 

DE Human REPTR 1 protein. 
XX 

KW REPTR; human; antiinflammatory; cytostatic; immunosuppressive; antiviral; 

KW anti-HIV; antiarthritic; anticonvulsant; nootropic; neuroprotective; 

KW antiallergic; antibody; immunogen; endometriosis; 

KW gastrointestinal disorder; gastritis; oesophageal carcinoma; 

KW Crohn's disease; irritable bowel syndrome; ulcerative colitis; 

KW endocrine disorder; hypothalamus disorder; Kallman's disease; 

KW autoimmune disease; inflammatory disease; infertility; receptor; 

KW acquired immune deficiency syndrome; AIDS; rheumatoid arthritis; allergy; 

KW osteoarthritis; diabetes mellitus; multiple sclerosis; 

KW systemic lupus erythematosus; cell proliferative disorder; cancer; 

KW developmental disorder; Duchenne muscular dystrophy; 

KW Becker muscular dystrophy; neurological disorder; epilepsy; 

KW Alzheimer's disease; Huntington's disease; reproductive disorder. 

XX 

OS Homo sapiens. 
XX 

PN WO200198354-A2. 
XX 

PD 27-DEC-2001. 
XX 

PF 21-JUN-2001; 2001WO-US019942.
XX

PR 21-JUN-2000; 2000US-0214027P.

PR 25-AUG-2000; 2000US-0228045P.

PR 12-DEC-2000; 2000US-0255104P.
XX 

PA (INCY-) INCYTE GENOMICS INC. 
XX 

PI Griffin JA, Kallick DA, Tribouley CM, Yue H, Nguyen DB, Tang YT; 

PI Lai P, Policky JL, Azimzai Y, Lu DAM, Graul R, Yao MG, Burford N; 

PI Hafalia AJA, Baughn MR, Bandman O, Patterson C, Yang J, Xu Y; 

PI Gandhi AR, Warren BA, Ding L, Sanjanwala MS, Duggan BM, Lu Y; 
XX 

DR WPI; 2002-090432/12. 

DR N-PSDB; ABK15169. 
XX 

PT Twelve human receptors (referred to as REPTR-1 to REPTR-12), useful in 

PT the diagnosis, treatment and prevention of gastrointestinal (e.g. 

PT gastritis), autoimmune/inflammatory (e.g. osteoarthritis) and cell 

PT proliferative (e.g. cancer) disorders.
XX 

PS Claim 45; Page 111-113; 157pp; English. 
XX 

CC This invention relates to twelve human receptor cDNA sequences (referred 

CC to as REPTR-1 to REPTR-12), and the proteins encoded thereby. The 

CC proteins of the invention may have antiinflammatory, cytostatic, 

CC immunosuppressive, antiviral, anti-HIV, antiarthritic, muscular active 

CC general, anticonvulsant, nootropic, neuroprotective, antiallergic 

CC activities. The sequences of the invention may be used to produce REPTR 

CC agonists or antagonists, and the protein sequences may be used to raise 

CC anti-REPTR antibodies. These molecules and the REPTR polynucleotides and 

CC polypeptides of the invention are useful in the diagnosis, treatment and 



CC prevention of gastrointestinal (e.g. gastritis, oesophageal carcinoma, 

CC Crohn's disease, irritable bowel syndrome, ulcerative colitis), endocrine 

CC (e.g. hypothalamus disorder, Kallman's disease), autoimmune/inflammatory 

CC (e.g. acquired immune deficiency syndrome (AIDS), rheumatoid arthritis, 

CC allergies, osteoarthritis, diabetes mellitus, multiple sclerosis, 

CC systemic lupus erythematosus), cell proliferative (e.g. cancer), 

CC developmental (e.g. Duchenne and Becker muscular dystrophy), neurological 

CC (e.g. epilepsy, Alzheimer's disease, Huntington's disease) and 

CC reproductive (e.g. infertility, endometriosis) disorders. Numerous other 

CC examples of each disorder are given in the specification. The present 

CC sequence represents the human REPTR1 protein sequence of the invention 

XX 

SQ Sequence 842 AA; 

Query Match 92.1%; Score 4413; DB 5; Length 842; 

Best Local Similarity 93.5%; Pred. No. 0; 

Matches 840; Conservative 1; Mismatches 1; Indels 56; Gaps 1; 

Qy 1 MAVRPGLWPALLGIVLAAWLRGSGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKP 60

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I 
Db 1 MAVRPGLWPALLGIVLAAWLRGSGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKP 60 

Qy 61 VLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLE 120 

I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 61 VLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLE 120 

Qy 121 EYWCQCVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAE 180 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I 
Db 121 EYWCQCVAWSSSGTTKSQKAYIRIAYLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAE 180

Qy 181 VEWLRNEDLVDPSLDPNVYITREHSLWRQARLADTANYTCVAKNIVARRRSASAAVIVY 240

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 181 VEWLRNEDLVDPSLDPNVYITREHSLWRQARLADTANYTCVAKNIVARRRSASAAVIVY 240

Qy 241 VNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGS 300 

I I II 

Db 241 VDGS 244 

Qy 301 WSPWSKWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHSASGPEDVA 360 

I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I 
Db 245 WSPWSKWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHTASGPEDVA 304

Qy 361 LYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLL 420

I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 305 LYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLL 364

Qy 421 TIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSSPTSEAEEFVS 480 

I I I I I I I I I M I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 365 TIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSSPTSEAEEFVS 424 

Qy 481 RLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHK 540

I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 425 RLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHK 484



Qy 541 PEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSW 600
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I
Db 485 PEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSW 544

Qy 601 EDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVACT 660
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 545 EDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVACT 604

Qy 661 SLEYNIRVYCLHDTHDALKEVVQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLW 720
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 605 SLEYNIRVYCLHDTHDALKEVVQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLW 664

Qy 721 KSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFSINF 780
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 665 KSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFSINF 724

Qy 781 NITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLAQKL 840
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 725 NITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLAQKL 784

Qy 841 HLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAEC 898
I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I
Db 785 HLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAEC 842



RESULT 9 
AAW78899 



ID AAW78899 standard; protein; 556 AA. 
XX 

AC AAW78899; 
XX 

DT 25-MAR-2003 (revised) 

DT 21-DEC-1998 (first entry) 

XX 

DE Human UNC-5 homologue UNC5H-1. 
XX 

KW UNC-5; UNC5H-1; human; netrin receptor; cell migration; axon guidance; 

KW diagnosis; therapy. 

XX 

OS Homo sapiens. 
XX 

FH Key Location/Qualifiers 

FT Misc-difference 7

FT /note= "encoded by TG"

FT Misc-difference 67

FT /note= "encoded by ATCT"

FT Misc-difference 256

FT /note= "encoded by GC"

FT Misc-difference 262

FT /note= "encoded by TG"

FT Misc-difference 360

FT /note= "encoded by AG"

FT Misc-difference 367

FT /note= "encoded by CC"

FT Misc-difference 370

FT /note= "encoded by TC"

FT Misc-difference 542

FT /note= "encoded by GG" 
XX 

PN WO9837085-A1. 



XX 

PD 27-AUG-1998. 
XX 

PF 19-FEB-1998; 98WO-US003143.
XX

PR 19-FEB-1997; 97US-00808982.
XX 

PA (REGC ) UNIV CALIFORNIA. 
XX 

PI Tessier-Lavigne M, Leonardo ED, Hinck L, Masu M, Keinomasu K; 
XX 

DR WPI; 1998-495364/42. 

DR N-PSDB; AAW78899. 
XX 

PT Netrin-binding, vertebrate proteins - useful for diagnosis, therapy and 

PT the biopharmaceutical industry. 

XX 

PS Claim 1; Page 22-23; 32pp; English. 
XX 

CC UNC5H-1 and UNC5H-2 (see AAW78901) are human homologues of Caenorhabditis 

CC elegans UNC-5 protein. Their amino acid sequences were deduced from 

CC isolated unc5h cDNA clones (see AAV52941 and AAV52943) isolated from an 

CC embryonic brain cDNA library. The predicted proteins show similarity with 

CC UNC-5, possess 2 predicted Ig-like domains and 2 predicted thrombospondin 

CC type-1 repeats, a predicted membrane spanning region, and a large 

CC intracellular domain. They are predicted to be involved in cell migration 

CC and axon guidance, and are characterised as receptor proteins for 

CC netrins. Rat UNC5H-1 (see AAW78898) and UNC5H-2 (see AAW78900) proteins 

CC are also claimed. Vertebrate UNC-5 proteins may be produced recombinantly 

CC from transfected host cells. The invention also provides unc-5 

CC hybridisation probes and primers, vertebrate UNC-5-specific binding 

CC agents such as specific antibodies, and methods of making and using the 

CC subject compositions in diagnosis (e.g. genetic hybridisation screens for 

CC vertebrate unc-5 transcripts), therapy (e.g. gene therapy to modulate 

CC vertebrate unc-5 gene expression) and in the biopharmaceutical industry 

CC (e.g. as immunogens, reagents for modulating cell guidance, reagents for 

CC screening chemical libraries for lead pharmacological agents, etc.). 

CC (Updated on 25-MAR-2003 to correct PI field.) 

XX 

SQ Sequence 556 AA; 

Query Match 58.8%; Score 2815; DB 2; Length 556; 
Best Local Similarity 96.9%; Pred. No. 1.2e-225; 

Matches 539; Conservative 1; Mismatches 16; Indels 0; Gaps 0; 

Qy 343 NCTSDLCVHSASGPEDVALYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTS 402

I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 1 NCTSDLXVHTASGPEDVALYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTS 60

Qy 403 GFQPVSIKPSKADNPHLLTIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGG 462 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 61 GFQPVSIKPSKADNPHLLTIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGG 120



Qy 463 RHTLHHSSPTSEAEEFVSRLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIP 522

I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II

Db 121 RHTLHHSSPTSEAEEFVSRLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIP 180



Qy 523 PDAIPRGKIYEIYLTLHKPEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEP 582

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 181 PDAIPRGKIYEIYLTLHKPEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEP 240



Qy 583 SPDSWSLRLKKQSCEGSWEDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALS 642 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I 
Db 241 SPDSWSLALKKQSCEGSWEDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALS 300 

Qy 643 VAAAKRLKLLLFAPVACTSLEYNIRVYCLHDTHDALKEVVQLEKQLGGQLIQEPRVLHFK 702

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | | 
Db 301 VAAAKRLKLLLFAPVACTSLEYNIRVYCLHDTHDALKEVVQLEKQLGGQLIQEPRVLHLX 360

Qy 703 DSYHNLRLSIHDVPSSLWKSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLAC 762 

I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I | | | | | | | I I 

Db 361 DSYHNLXLSXHDVPSSLWKSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLAC 420 

Qy 763 KLWVWQVEGDGQSFSINFNITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISS 822 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 421 KLWWQVEGDGQSFSINFNITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISS 480 

Qy 823 LDPPCRRGADWRTLAQKLHLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAG 882 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 481 LDPPCRRGADWRTIAQKLHLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQIJWVVAG 540 

Qy 883 LGQPDAGLFTVSEAEC 898 

I I I I I I 

Db 541 TPAGRWLLSQCSEAEC 556 



RESULT 10
AAB50691

ID AAB50691 standard; protein; 931 AA.
XX

AC AAB50691;
XX

DT 19-MAR-2001 (first entry)
XX

DE Human UNC5C protein SEQ ID NO: 90.
XX

KW Human; Caenorhabditis elegans; UNC-5; splice variant; nematode worm;

KW protein-protein interaction; identification.
XX

OS Homo sapiens.
XX

PN WO200073328-A2.
XX

PD 07-DEC-2000.
XX

PF 02-JUN-2000; 2000WO-EP005108.
XX

PR 01-JUN-1999; 99GB-00012755.
XX

PA (DEVG-) DEVGEN NV.
XX

PI Van Criekinge W, Roelens I, Bogaert T, Verwaerde P;
XX

DR WPI; 2001-016508/02.
XX

PT Three variants of human unc-5C cDNAs (unc-5Cb, unc-5Cc and unc-5C8) and a 

PT human unc-5HS1 cDNA, useful in yeast two hybrid experiments for

PT identifying unknown human cDNAs which encode proteins that interact with 

PT the human unc-5C protein. 

XX 

PS Disclosure; Page 224-227; 246pp; English. 
XX 

CC The present invention describes 3 variants of human unc-5C cDNAs (unc- 

CC 5Cb, unc-5Cc and unc-5C8) which correspond to alternatively spliced unc- 

CC 5C transcripts, and a human unc-5HS1 cDNA which shares homology with the

CC Rattus norvegicus unc-5HS1 cDNA. Also described are assays based on

CC protein-protein-interactions between the unc-5 protein and a variety of 

CC different interacting proteins. The unc-5C variant cDNAs and unc-5HS1

CC cDNA are useful in methods for identifying compounds which reduce or 

CC inhibit the lethal phenotype associated with the expression of the unc-5 

CC death domain in yeast. They are also useful in yeast two hybrid 

CC experiments for identifying unknown human cDNAs which encode proteins 

CC that interact with the human unc-5C protein. AAC90914 to AAC90971 and 

CC AAB50646 to AAB50693 represent sequences used in the exemplification of 

CC the present invention 

XX 

SQ Sequence 931 AA; 



Query Match 57.5%; Score 2755; DB 4; Length 931; 

Best Local Similarity 56.4%; Pred. No. 2.8e-220; 

Matches 514; Conservative 154; Mismatches 215; Indels 28; Gaps 9; 

Qy 9 PALLGIVLAAWLRGSGAQQS ATVANPVPGANPDLLPHFLVEPEDVYIVKNKPVLLVC 65 

III : I : I I I I I : I I : I I I I I : I I I : I I I I I I I I I I 

Db 26 PAL--ALLSASGTGSAAQDDDFFHELPETFPSDPPEPLPHFLIEPEEAYIVKNKPVNLYC 83

Qy 66 KAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLEEYWCQ 125 

II llllhllll I I I I II::: I : I I I II I : I I I I I I :: I I I : I I I I 
Db 84 KASPATQIYFKCNSEWVHQKDHIVDERVDETSGLIVREVSIEISRQQVEELFGPEDYWCQ 143 

Qy 126 CVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAEVEWLR 185 

M M I I : I I I I I : I I I : I I I III I I I I I I I I I I I I I ::| MINIM MINI: 
Db 144 CVAWSSAGTTKSRKAYVRIAYLRKTFEQEPLGKEVSLEQEVLLQCRPPEGIPVAEVEWLK 203

Qy 186 NEDLVDPSLDPNVYITREHSLWRQARLADTANYTCVAKNIVARRRSASAAVIVYVNGGW 245

I I I :: I I I I Ml : I : I : :: I II I : I I II I I I I I I I I I I : I : I : I I I II I I II I 
Db 204 NEDIIDPVEDRNFYITIDHNLIIKQARLSDTANYTCVAKNIVAKRKSTTATVIVYVNGGW 263

Qy 246 STWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGSWSPWS 305 

I I I I I I I I I : : I I I I : I II : I : I I I I I I I M I I I I I I I : I II II I I I I I I I Mill 
Db 264 STWTEWSVCNSRCGRGYQKRTRTCTNPAPLNGGAFCEGQSVQKIACTTLCPVDGRWTPWS 323 

Qy 306 KWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHSASGPEDVALYVGL 365 

I I I I I : I I I I I I I I : I I I : II I :: I I I : : I I I I M : I : I I I I I I M 
Db 324 KWSTCGTECTHWRRRECTAPAPKNGGKDCDGLVLQSKNCTDGLCMQTAPDSDDVALYVGI 383 

Qy 366 -IAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLLTIQP 424 

I I I I I I : : : I : I I : : | M I II I I I II I : I I : : I I I : I 
Db 384 VIAVIVCLAISVWALFVYRKNHRDFESDIIDSSALNGGFQPVNIKAARQD LLAVPP 440



Qy 425 DLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHS SPTSEAEEFVS 480



II:: I : I : I II : I I : I I I : : : : : II : III 
Db 441 DLTSAAAMYRGPVYALHD-VSDKIPMTNSPILDPLPNLKIKVYNTSGAVSPQDDLSEFTS 499

Qy 481 RLS TQNYF RSLPRGT — SNMTYGTFNFLGGRLMIPNTGISLLI PPDAI 526 

: I I M: : I I I I I : I : I I I I I I : : I I : I : I I I I I II 

Db 500 KLSPQMTQSLLENEALSLKNQSLARQTDPSCTAFGSFNSLGGHLIVPNSGVSLLIPAGAI 559 

Qy 527 PRGKIYEIYLTLHKPEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDS 586

I : I : : I I : I : I : I : | : | | : I I I I : I : I I I I I I I I I I I I I : I I II : I : : 
Db 560 PQGRVYEMYVTVHRKETMRPPMDDSQTLLTPVVSCGPPGALLTRPVVLTMHHCADPNTED 619

Qy 587 WSLRLKKQSCEGSWEDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAA 646 

I : I I I : : I I I I I : : I I I : I : I : I II*- II I : I I I I : : I I I 
Db 620 WKILLKNQAAQGQWEDVVVVGEENFTTPCYIKLDAEACHILTENLSTYALVGHSTTKAAA 679

Qy 647 KRLKLLLFAPVACTSLEYNIRVYCLHDTHDALKEWQLEKQLGGQLIQEPRVLHFKDSYH 706 

I I I I I : I I : I : I II I : I I I I I I II Mill:: I I : I | | | | : : | | : j | | | | | 
Db 680 KRLKLAIFGPLCCSSLEYSIRVYCLDDTQDALKEILHLERQTGGQLLEEPKALHFKGSTH 739 

Qy 707 NLRLSIHDVPSSLWKSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWV 766 

I I I I I I I I : I I I I I I I I I I I I I I I I : I : I : I I I I I I I I I I I I : I : I III I 
Db 740 NLRLSIHDIAHSLWKSKLLAKYQEIPFYHVWSGSQRNLHCTFTLERFSLNTVELVCKLCV 799

Qy 767 WQVEGDGQSFSINFNITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISSLDPP 826 

I I I I : I I I : I : : : : : I I : : : I I I I I I I I I I I : I I I I I 

Db 800 RQVEGEGQIFQLNCTVSEEPTGIDLPLLDPANTITTVTGPSAFSIPLPIRQKLCSSLDAP 859 

Qy 827 CRRGADWRTLAQKLHLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQP 886 

II III II I I : I I : I : : I I : I III : I I : I I I I : : I I : I I I I III : : I : 
Db 860 QTRGHDWRMLAHKLNLDRYLNYFATKSSPTGVILDLWEAQNFPDGNLSMLAAVLEEMGRH 919 

Qy 887 DAGLFTVSEAE 897

: : : | : 
Db 920 ETVVSLAAEGQ 930



RESULT 11 
ADE63098 

ID ADE63098 standard; protein; 931 AA. 
XX 

AC ADE63098; 
XX 

DT 29-JAN-2004 (first entry) 
XX 

DE Human Protein AAC67491, SEQ ID NO 9033. 
XX 

KW Human; pain; neuronal tissue; gene therapy; 

KW spinal segmental nerve injury; chronic constriction injury; CCI; 

KW spared nerve injury; SNI; Chung. 

XX 

OS Homo sapiens. 
XX 

PN WO2003016475-A2. 
XX 

PD 27-FEB-2003. 
XX 

PF 14-AUG-2002; 2002WO-US025765 . 



XX 

PR 14-AUG-2001; 2001US-0312147P . 

PR 01-NOV-2001; 2001US-0346382P.

PR 26-NOV-2001; 2001US-0333347P . 
XX 

PA (GEHO ) GEN HOSPITAL CORP. 

PA (FARB ) BAYER AG. 

XX 

PI Woolf C, D'urso D, Befort K, Costigan M; 
XX 

DR WPI; 2003-268312/26. 

DR GENBANK; AAC67491. 
XX 

PT New composition comprising two or more isolated polypeptides, useful for 

PT preparing a medicament for treating pain in an animal. 

XX 

PS Claim 1; Page; 1017pp; English. 
XX 

CC The invention discloses a composition comprising two or more isolated rat 

CC or human polynucleotides or a polynucleotide which represents a fragment, 

CC derivative or allelic variation of the nucleic acid sequence. Also 

CC claimed are a vector comprising the novel polynucleotide, a host cell 

CC comprising the vector, a method for identifying a nucleotide sequence 

CC which is differentially regulated in an animal subjected to pain and a 

CC kit to perform the method, an array, a method for identifying an agent 

CC that increases or decreases the expression of the polynucleotide sequence 

CC that is differentially expressed in neuronal tissue of a first animal 

CC subjected to pain, a method for identifying a compound which regulates 

CC the expression of a polynucleotide sequence which is differentially 

CC expressed in an animal subjected to pain, a method for identifying a 

CC compound that regulates the activity of one or more of the 

CC polynucleotides, a method for producing a pharmaceutical composition, a 

CC method for identifying a compound or small molecule that regulates the 

CC activity in an animal of one or more of the polypeptides given in the 

CC specification, a method for identifying a compound useful in treating 

CC pain and a pharmaceutical composition comprising the one or more 

CC polypeptides or their antibodies. The polynucleotide or the compound that 

CC modulates its activity is useful for preparing a medicament for treating 

CC pain (e.g. spinal segmental nerve injury (Chung), chronic constriction 

CC injury (CCI) and spared nerve injury (SNI)) in an animal (e.g. gene

CC therapy). The sequence presented is a human protein (shown in Table 2 of

CC the specification) which is differentially expressed during pain. Note: 

CC The sequence data for this patent did not form part of the printed 

CC specification, but was obtained in electronic form directly from WIPO at 

CC ftp.wipo.int/pub/published_pct_sequences.

XX 

SQ Sequence 931 AA; 

Query Match 57.5%; Score 2755; DB 7; Length 931; 
Best Local Similarity 56.4%; Pred. No. 2.8e-220; 

Matches 514; Conservative 154; Mismatches 215; Indels 28; Gaps 9; 

Qy 9 PALLGIVLAAWLRGSGAQQS ATVANPVPGANPDLLPHFLVEPEDVYIVKNKPVLLVC 65

III : I : I I I I I : I I : I I I I I : I I I : I I I I I I I I I I 

Db 26 PAL--ALLSASGTGSAAQDDDFFHELPETFPSDPPEPLPHFLIEPEEAYIVKNKPVNLYC 83



Qy 66 KAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLEEYWCQ 125



II I I I I I : II I I III I II::: I : I I I II I : I I I I I I : : I I I : I I I I 
Db 84 KASPATQIYFKCNSEWVHQKDHIVDERVDETSGLIVREVSIEISRQQVEELFGPEDYWCQ 143 



Qy 126 CVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAEVEWLR 185

I I I I I I : I I I I I : I II : I I I III I I I I I I I I I I I I I : : I I I I I I I I I MINI: 
Db 144 CVAWSSAGTTKSRKAYVRIAYLRKTFEQEPLGKEVSLEQEVLLQCRPPEGIPVAEVEWLK 203 

Qy 186 NEDLVDPSLDPNVYITREHSLWRQARLADTANYTCVAKNIVARRRSASAAVIVYVNGGW 245

I I I : : I I I I III : I : I : : : I I I I : I I I I I I I I I I I I I I : I : I : I I I I I I I I I I 

Db 204 NEDIIDPVEDRNFYITIDHNLIIKQARLSDTANYTCVAKNIVAKRKSTTATVIVYVNGGW 263 

Qy 246 STWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGSWSPWS 305 

MINIMI:: II II M I M M M M II II I M M II I M II II I I M I I I hill 
Db 264 STWTEWSVCNSRCGRGYQKRTRTCTNPAPLNGGAFCEGQSVQKIACTTLCPVDGRWTPWS 323

Qy 306 KWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHSASGPEDVALYVGL 365 

II I II M M II III: M I : I I I : : I I I : : I I I I I : : I : I I I I I I I : 

Db 324 KWSTCGTECTHWRRRECTAPAPKNGGKDCDGLVLQSKNCTDGLCMQTAPDSDDVALYVGI 383

Qy 366 -IAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLLTIQP 424

I I I I I I : : : I : I I : : I I : I I I I I I I I I : I I : : I I I : I 
Db 384 VIAVIVCLAISVWALFVYRKNHRDFESDIIDSSALNGGFQPVNIKAARQD LLAVPP 440 

Qy 425 DLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHS SPTSEAEEFVS 480

II:: I : I : I II : I I MM : : : : : II : III 

Db 441 DLTSAAAMYRGPVYALHD-VSDKIPMTNSPILDPLPNLKIKVYNTSGAVSPQDDLSEFTS 499 

Qy 481 RLS TQNYF RSLPRGT — SNMTYGTFNFLGGRLMIPNTGISLLIPPDAI 526 

: I I II: Mill I : I : I I I I I I : : I I : I : I I I I I II 

Db 500 KLSPQMTQSLLENEALSLKNQSLARQTDPSCTAFGSFNSLGGHLIVPNSGVSLLIPAGAI 559 

Qy 527 PRGKIYEIYLTLHKPEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDS 586

I : I : : I I : I : I : I : I : I I : I I I I : I : I I I I I I I I I I I I I : I I II : I : : 
Db 560 PQGRVYEMYVTVHRKETMRPPMDDSQTLLTPVVSCGPPGALLTRPVVLTMHHCADPNTED 619

Qy 587 WSLRLKKQSCEGSWEDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAA 646

I : I I I : : I I I I I : : I I I : I MM I I : : II I Mill : : III 
Db 620 WKILLKNQAAQGQWEDVVVVGEENFTTPCYIKLDAEACHILTENLSTYALVGHSTTKAAA 679

Qy 647 KRLKLLLFAPVACTSLEYNIRVYCLHDTHDALKEWQLEKQLGGQLIQEPRVLHFKDSYH 706

I I I I I : I I : I M I I I M I I I I I II Mill:: MM I I I I :: I I : I I I I I I 
Db 680 KRLKLAIFGPLCCSSLEYSIRVYCLDDTQDALKEILHLERQTGGQLLEEPKALHFKGSTH 739

Qy 707 NLRLSIHDVPSSLWKSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWV 766

I I I I I I I I : I I I I I I I I II I I I I II M M M I I I I I I I I II I M M III I 
Db 740 NLRLSIHDIAHSLWKSKLLAKYQEIPFYHVWSGSQRNLHCTFTLERFSLNTVELVCKLCV 799 

Qy 767 WQVEGDGQSFSINFNITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISSLDPP 826

I I I I M I I M : : : : M I : : : I I I I I I I I I M : I I I I I 

Db 800 RQVEGEGQIFQLNCTVSEEPTGIDLPLLDPANTITTVTGPSAFSIPLPIRQKLCSSLDAP 859 

Qy 827 CRRGADWRTLAQKLHLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQP 886

II III II I I M I : I : M I : I III : I I : M I I : M I M I M III : : I : 
Db 860 QTRGHDWRMLAHKLNLDRYLNYFATKSSPTGVILDLWEAQNFPDGNLSMLAAVLEEMGRH 919

Qy 887 DAGLFTVSEAE 897 

: : : I : 



Db 920 ETVVSLAAEGQ 930



RESULT 12 
ABG11551 

ID ABG11551 standard; protein; 982 AA. 
XX 

AC ABG11551; 
XX 

DT 18-FEB-2002 (first entry) 
XX 

DE Novel human diagnostic protein #11542. 
XX 

KW Human; chromosome mapping; gene mapping; gene therapy; forensic; 

KW food supplement; medical imaging; diagnostic; genetic disorder. 
XX 

OS Homo sapiens. 
XX 

PN WO200175067-A2 . 
XX 

PD 11-OCT-2001.
XX 

PF 30-MAR-2001; 2001WO-US008631 . 
XX 

PR 31-MAR-2000; 2000US-00540217.

PR 23-AUG-2000; 2000US-00649167.
XX 

PA (HYSE-) HYSEQ INC. 
XX 

PI Drmanac RT, Liu C, Tang YT; 
XX 

DR WPI; 2001-639362/73. 

DR N-PSDB; AAS75738. 
XX 

PT New isolated polynucleotide and encoded polypeptides, useful in 

PT diagnostics, forensics, gene mapping, identification of mutations

PT responsible for genetic disorders or other traits and to assess 

PT biodiversity. 
XX 

PS Claim 20; SEQ ID NO 41910; 103pp; English. 
XX 

CC The invention relates to isolated polynucleotide (I) and polypeptide (II) 

CC sequences. (I) is useful as hybridisation probes, polymerase chain 

CC reaction (PCR) primers, oligomers, and for chromosome and gene mapping, 

CC and in recombinant production of (II). The polynucleotides are also used

CC in diagnostics as expressed sequence tags for identifying expressed

CC genes. (I) is useful in gene therapy techniques to restore normal

CC activity of (II) or to treat disease states involving (II). (II) is

CC useful for generating antibodies against it, detecting or quantitating a

CC polypeptide in tissue, as molecular weight markers and as a food

CC supplement. (II) and its binding partners are useful in medical imaging

CC of sites expressing (II). (I) and (II) are useful for treating disorders

CC involving aberrant protein expression or biological activity. The 

CC polypeptide and polynucleotide sequences have applications in 

CC diagnostics, forensics, gene mapping, identification of mutations 

CC responsible for genetic disorders or other traits to assess biodiversity 

CC and to produce other types of data and products dependent on DNA and 



CC amino acid sequences. ABG00010-ABG30377 represent novel human diagnostic 

CC amino acid sequences of the invention. Note: The sequence data for this 

CC patent did not appear in the printed specification, but was obtained in 

CC electronic format directly from WIPO at 

CC ftp.wipo.int/pub/published_pct_sequences
XX 

SQ Sequence 982 AA; 

Query Match 57.5%; Score 2755; DB 4; Length 982; 

Best Local Similarity 56.4%; Pred. No. 3e-220; 

Matches 514; Conservative 154; Mismatches 215; Indels 28; Gaps 9; 

Qy 9 PALLGIVLAAWLRGSGAQQS ATVANPVPGANPDLLPHFLVEPEDVYIVKNKPVLLVC 65

III : I : I MM : I I : II I II : M I : M I I I I I I I I 

Db 77 PAL--ALLSASGTGSAAQDDDFFHELPETFPSDPPEPLPHFLIEPEEAYIVKNKPVNLYC 134 

Qy 66 KAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLEEYWCQ 125

II II I I I : I I I I III I II::: I MM II I : I I I I I I : : I I I M I I I 
Db 135 KASPATQIYFKCNSEWVHQKDHIVDERVDETSGLIVREVSIEISRQQVEELFGPEDYWCQ 194 

Qy 126 CVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAEVEWLR 185 

I I I I I I : I M I I : I I I : II I III I I I II I II I II II :: I I I I II II I II II II : 

Db 195 CVAWSSAGTTKSRKAYVRIAYLRKTFEQEPLGKEVSLEQEVLLQCRPPEGIPVAEVEWLK 254 

Qy 186 NEDLVDPSLDPNVYITREHSLWRQARLADTANYTCVAKNIVARRRSASAAVIVYVNGGW 245

M I : : I I I I III : I : I : : M M I : M M I I M I I I M I : I : I M I I I I I II I I 
Db 255 NEDIIDPVEDRNFYITIDHNLIIKQARLSDTANYTCVAKNIVAKRKSTTATVIVYVNGGW 314

Qy 246 STWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGSWSPWS 305 

II M I II I I : : II I I M II : I : I I II II II II I II II I M II II II I II II Mill 

Db 315 STWTEWSVCNSRCGRGYQKRTRTCTNPAPLNGGAFCEGQSVQKIACTTLCPVDGRWTPWS 374 

Qy 306 KWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHSASGPEDVALYVGL 365 

III II : I II II MM I M : II I : : I I I : : M I I I : : I : I I I I I I I : 

Db 375 KWSTCGTECTHWRRRECTAPAPKNGGKDCDGLVLQSKNCTDGLCMQTAPDSDDVALYVGI 434

Qy 366 -IAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLLTIQP 424 

I II II I : : : I : I I : : I I : M I I II M I : I I : : I I I : I 
Db 435 VIAVIVCLAISVWALFVYRKNHRDFESDIIDSSALNGGFQPVNIKAARQD LLAVPP 491

Qy 425 DLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHS SPTSEAEEFVS 480 

II:: Ml: I II Ml MM : : : : : M : III 
Db 492 DLTSAAAMYRGPVYALHD-VSDKIPMTNSPILDPLPNLKIKVYNTSGAVSPQDDLSEFTS 550 

Qy 481 RLS TQNYF RSLPRGT — SNMTYGTFNFLGGRLMIPNTGISLLI PPDAI 526

Ml II: Mill I M : M I I I I :: I I M : I II I I II 

Db 551 KLSPQMTQSLLENEALSLKNQSLARQTDPSCTAFGSFNSLGGHLIVPNSGVSLLIPAGAI 610 

Qy 527 PRGKIYEIYLTLHKPEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDS 586

I M : : I I M M : I : I M |: MMMMIIMM I I I I I I : I I II :|: : 
Db 611 PQGRVYEMYVTVHRKETMRPPMDDSQTLLTPVVSCGPPGALLTRPVVLTMHHCADPNTED 670 

Qy 587 WSLRLKKQSCEGSWEDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAA 646 

I : I I I : M I I II : M I I : I : I : I I I : : II I : I I I I : : II I 
Db 671 WKILLKNQAAQGQWEDVVVVGEENFTTPCYIKLDAEACHILTENLSTYALVGHSTTKAAA 730



Qy 647 KRLKLLLFAPVACTSLEYNIRVYCLHDTHDALKEWQLEKQLGGQLIQEPRVLHFKDSYH 706

I I I I I : I I : hlllhllllll II INN:: | | : | | | | | : : | | : Mil I I

Db 731 KRLKLAIFGPLCCSSLEYSIRVYCLDDTQDALKEILHLERQTGGQLLEEPKALHFKGSTH 790



Qy 707 NLRLSIHDVPSSLWKSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWV 766

I II I II II : M II II I I I II I M II : I : I : II I I M I I I I I I : I : I III I

Db 791 NLRLSIHDIAHSLWKSKLLAKYQEIPFYHVWSGSQRNLHCTFTLERFSLNTVELVCKLCV 850



Qy 767 WQVEGDGQSFSINFNITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISSLDPP 826

M I I : I I I : I : : : : : I I : : : Mill II M M : I I I I I

Db 851 RQVEGEGQIFQLNCTVSEEPTGIDLPLLDPANTITTVTGPSAFSIPLPIRQKLCSSLDAP 910



Qy 827 CRRGADWRTLAQKLHLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQP 886

II Ml II IIMI : I : : I I : I III : II : I II I : : M : I I I I Ml : : I :

Db

Qy 887 DAGLFTVSEAE 897

Db 971 ETVVSLAAEGQ 981



RESULT 13 
ADE63096 

ID ADE63096 standard; protein; 945 AA. 
XX 

AC ADE63096; 
XX 

DT 29-JAN-2004 (first entry) 
XX 

DE Rat Protein AAB57679, SEQ ID NO 9031. 
XX 

KW Rat; pain; neuronal tissue; gene therapy; spinal segmental nerve injury; 

KW chronic constriction injury; CCI; spared nerve injury; SNI; Chung. 

XX 

OS Rattus norvegicus. 
XX 

PN WO2003016475-A2. 
XX 

PD 27-FEB-2003. 
XX 

PF 14-AUG-2002; 2002WO-US025765 . 
XX 

PR 14-AUG-2001; 2001US-0312147P . 

PR 01-NOV-2001; 2001US-0346382P . 

PR 26-NOV-2001; 2001US-0333347P . 
XX 

PA (GEHO ) GEN HOSPITAL CORP. 

PA (FARB ) BAYER AG. 

XX 

PI Woolf C, D'urso D, Befort K, Costigan M; 
XX 

DR WPI; 2003-268312/26. 

DR GENBANK; AAB57679.
XX 

PT New composition comprising two or more isolated polypeptides, useful for

PT preparing a medicament for treating pain in an animal.

XX 

PS Claim 1; Page; 1017pp; English. 



XX 



CC The invention discloses a composition comprising two or more isolated rat 

CC or human polynucleotides or a polynucleotide which represents a fragment, 

CC derivative or allelic variation of the nucleic acid sequence. Also 

CC claimed are a vector comprising the novel polynucleotide, a host cell 

CC comprising the vector, a method for identifying a nucleotide sequence 

CC which is differentially regulated in an animal subjected to pain and a 

CC kit to perform the method, an array, a method for identifying an agent 

CC that increases or decreases the expression of the polynucleotide sequence 

CC that is differentially expressed in neuronal tissue of a first animal 

CC subjected to pain, a method for identifying a compound which regulates 

CC the expression of a polynucleotide sequence which is differentially 

CC expressed in an animal subjected to pain, a method for identifying a 

CC compound that regulates the activity of one or more of the 

CC polynucleotides, a method for producing a pharmaceutical composition, a 

CC method for identifying a compound or small molecule that regulates the 

CC activity in an animal of one or more of the polypeptides given in the 

CC specification, a method for identifying a compound useful in treating 

CC pain and a pharmaceutical composition comprising the one or more 

CC polypeptides or their antibodies. The polynucleotide or the compound that 

CC modulates its activity is useful for preparing a medicament for treating 

CC pain (e.g. spinal segmental nerve injury (Chung), chronic constriction 

CC injury (CCI) and spared nerve injury (SNI)) in an animal (e.g. gene

CC therapy). The sequence presented is a rat protein (shown in Table 2 of

CC the specification) which is differentially expressed during pain. Note: 

CC The sequence data for this patent did not form part of the printed 

CC specification, but was obtained in electronic form directly from WIPO at 

CC ftp.wipo.int/pub/published_pct_sequences.
XX 

SQ Sequence 945 AA; 

Query Match 53.8%; Score 2578.5; DB 7; Length 945; 

Best Local Similarity 53.0%; Pred. No. 1.5e-205; 

Matches 509; Conservative 142; Mismatches 231; Indels 79; Gaps 17; 

Qy 1 MAVRPGLWPALLGIVLAAW LRG — S GAQQ SATVANPVPGANPDLLPHFLVEPEDV 53

I II III : I I II III : : II : I I I I I : I I I I 

Db 1 MRARSGARGALLLALLLCWDPTPSLAGI DSGGQ ALPDSFPSAPAEQLPHFLLEPEDA 57 

Qy 54 YIVKNKPVLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQV 113 



Db 58 YIVKNKPVELHCRAFPATQIYFKCNGEWVSQKGHVTQESLDEATGLRIREVQIEVSRQQV 117

Qy 114 EKVFGLEEYWCQCVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPP 173

Db

Qy 174 EGIPPAEVEWLRNEDLVDPSLDPNVYITREHSLWRQARLADTANYTCVAKNIVARRRSA 233

Db 178 EGVPVAEVEWLKNEDVIDPAQDTNFLLTIDHNLIIRQARLSDTANYTCVAKNIVAKRRST 237

Qy 234 SAAVIVYVNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACAT 293

Db 238 TATVIVYVNGGWSSWAEWSPCSNRCGRGWQKRTRTCTNPAPLNGGAFCEGQACQKTACTT 297



Qy 294 LCPVDGSWSPWSKWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCV 350 

: I I I I I : I : I I I I I I I : I I I I I I I I I hill : I II I I : : I I I I II 



Db 298 VCPVDGAWTEWSKWSACSTECAHWRSRECMAPPPQNGGRDCSGTLLDSKNCTDGLCVLNQ 357

Qy 351 HSASGPE DVALYVGL-IAVAVCLVLLLLVLILVYCRKKEGLDSDVADSS-IL 400

: : I : I I I I I I I : I I I I : I : I : : I I I I : I : I I I I 

Db 358 RTLNDPKSRPLEPSGDVALYAGLVVAVFVVLAVLMAVGVIVYRRNCRDFDTDITDSSAAL 417

Qy 401 TSGFQPVSIKPSKADNPHLL--TIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSP 458

I I I I I : I : : I I I I : III:: I : I : II : I : I I III 

Db 418 TGGFHPVNFKTARPSNPQLLHPSAPPDLTASAGIYRGPVYALQDS-ADKIPMTNSPLLDP 476

Qy 459 L GGG RHTLHHSSPTSEAEEFVS 480 

I II I II I : 
Db 477 LPSLKIKVYDSSTIGSGAGLADGADLLGVLPPGTYPGDFSRDTHFLHLRS A 527 

Qy 481 RLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHK 540

I : I : III I : III I I I I I I I I I : I I I : I I I I : I I I : : I I : : I 
Db 528 SLGSQ-HLLGLPRDPSSSVSGTFGCLGGRLTIPGTGVSLLVPNGAIPQGKFYDLYLRINK 586 

Qy 541 PEDVRLPLA-GCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGS 599

I I I I : I I I : I I I I : I I I I : I I I I I : I : I I I I : I I I : : I 

Db 587 TEST-LPLSEGSQTVLSPSVTCGPTGLLLCRPWLTVPHCAEVIAGDWIFQLKTQAHQGH 645 

Qy 600 WEDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVAC 659 

I I : I : I I I : I I I I I I : I : : : I I I : I I : I : I I I I : I : I I I I 
Db 646 WEEVVTLDEETLNTPCYCQLEAKSCHILLDQLGTYVFTGESYSRSAVKRLQLAIFAPALC 705

Qy 660 TSLEYNIRVYCLHDTHDALKEWQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSL 719 

I I I I I :: I I I I I II I I I I I : : I I : III I : : I I : I I I I I I I I I I I I : I I : I : 
Db 706 TSLEYSLRVYCLEDTPAALKEVLELERTLGGYLVEEPKTLLFKDSYHNLRLSLHDIPHAH 765 

Qy 720 WKSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFSIN 779 

1:1111 I I I I I I I I : I I I : I : I I I I I I I I I I ::: : I I : I I I I I : I I I : : 

Db 766 WRSKLLAKYQEIPFYHVWNGSQKALHCTFTLERHSLASTEFTCKVCVRQVEGEGQIFQLH 825

Qy 780 FNITKDTRFAELLALESEAGVPAL--VGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLA 837

: : I I I I I I I : I I I I I I I I I I I I : I I I I 11 III II 

Db 826 TTLA-ETPAGSLDALCSAPGNAATTQLGPYAFKIPLSIRQKICNSLDAPNSRGNDWRLLA 884 

Qy 838 QKLHLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAE 897 

I I I : I : I : : I I : I I I I : I I : I I I I I : I : I : I I : I : : | : : : : : : 
Db 885 QKLSMDRYLNYFATKASPTGVILDLWEARQQDDGDLNSLASALEEMGKSEMLVAMTTDGD 944 

Qy 898 C 898 

I 

Db 945 C 945 



RESULT 14 
AAW78900 

ID AAW78900 standard; protein; 943 AA. 
XX 

AC AAW78900; 
XX 

DT 25-MAR-2003 (revised) 

DT 21-DEC-1998 (first entry) 

XX 

DE Rat UNC-5 homologue UNC5H-2.



XX 

KW UNC-5; UNC5H-2; rat; netrin receptor; cell migration; axon guidance; 

KW diagnosis; therapy. 

XX 

OS Rattus sp. 
XX 

FH Key Location/Qualifiers 

FT Peptide 148. .161 

FT /note= "peptide used to raise rabbit polyclonal antisera" 

FT Misc-difference 753

FT /note= "encoded by CG" 

FT Peptide 909. .924 

FT /note= "peptide used to raise rabbit polyclonal antisera" 
XX 

PN WO9837085-A1. 
XX 

PD 27-AUG-1998. 
XX 

PF 19-FEB-1998; 98WO-US003143.
XX 

PR 19-FEB-1997; 97US-00808982 . 
XX 

PA (REGC ) UNIV CALIFORNIA. 
XX 

PI Tessier-Lavigne M, Leonardo ED, Hinck L, Masu M, Keinomasu K; 
XX 

DR WPI; 1998-495364/42. 

DR N-PSDB; AAV52942. 
XX 

PT Netrin-binding, vertebrate proteins - useful for diagnosis, therapy and 

PT the biopharmaceutical industry. 

XX 

PS Claim 1; Page 24-26; 32pp; English. 
XX 

CC UNC5H-1 and UNC5H-2 (see AAW78900) are rat homologues of Caenorhabditis 

CC elegans UNC-5 protein. Their amino acid sequences were deduced from 

CC isolated unc5h cDNA clones (see AAV52940 and AAV52942) isolated from an 

CC E18 brain cDNA library. The predicted proteins show similarity with UNC- 

CC 5, possess 2 predicted Ig-like domains and 2 predicted thrombospondin 

CC type-1 repeats, a predicted membrane spanning region, and a large 

CC intracellular domain. They are predicted to be involved in cell migration 

CC and axon guidance, and are characterised as receptor proteins for 

CC netrins. Human UNC5H-1 (see AAW78899) and UNC5H-2 (see AAW78901) proteins 

CC are also claimed. Vertebrate UNC-5 proteins may be produced recombinantly 

CC from transfected host cells. The invention also provides unc-5 

CC hybridisation probes and primers, vertebrate UNC-5-specific binding

CC agents such as specific antibodies, and methods of making and using the 

CC subject compositions in diagnosis (e.g. genetic hybridisation screens for 

CC vertebrate unc-5 transcripts), therapy (e.g. gene therapy to modulate 

CC vertebrate unc-5 gene expression) and in the biopharmaceutical industry 

CC (e.g. as immunogens, reagents for modulating cell guidance, reagents for 

CC screening chemical libraries for lead pharmacological agents, etc.). 

CC (Updated on 25-MAR-2003 to correct PI field.) 

XX 

SQ Sequence 943 AA; 



Query Match 53.7%; Score 2571.5; DB 2; Length 943;



Best Local Similarity 53.3%; Pred. No. 5.8e-205; 

Matches 504; Conservative 142; Mismatches 221; Indels 79; Gaps 16; 



Qy 9 PALLGIVLAAWLRGSGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKPVLLVCKAV 68 

I : I I I I I I I : : II : I I I I I : I I I I I I I I I I I I I I : I 

Db 21 PSLAGI DSGAQ GLPDSFPSAPAEQLPHFLLEPEDAYIVKNKPVELHCRAF 70 

Qy 69 PATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLEEYWCQCVA 128 

I I I I I : I I I I I I I I I I I : I I : : I I I I : I I I I I I I I : : I I I I : I I I I I I I 

Db 71 PATQIYFKCNGEWVSQKGHVTQESLDEATGLRIREVQIEVSRQQVEELFGLEDYWCQCVA 130 

Qy 129 WSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAEVEWLRNED 188

I I I I I I I I I : : I I I I I I I I I I I : I I I I I I I I I : : : I I I I I I I : I I I I I I I : I I I 

Db 131 WSSSGTTKSRRAYIRIAYLRKNFDQEPLAKEVPLDHEVLLQCRPPEGVPVAEVEWLKNED 190 

Qy 189 LVDPSLDPNVYITREHSLWRQARLADTANYTCVAKNIVARRRSASAAVIVYVNGGWSTW 248

: : I I : I I : I : I : I : : I I I I I : I I I I I I I I I I I I I I : I I I : I I I I I I I I I I I : I 
Db 191 VIDPAQDTNFLLTIDHNLIIRQARLSDTANYTCVAKNIVAKRRSTTATVIVYVNGGWSSW 250 

Qy 249 TEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGSWSPWSKWS 308 

III II I I I I I I I I : I : I I I I I I I I I II I I I I I I I I I I I : I I I I I : I : I I I I I 
Db 251 AEWSPCSNRCGRGWQKRTRTCTNPAPLNGGAFCEGQACQKTACTTVCPVDGAWTEWSKWS 310 

Qy 309 ACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCV HSASGPE 357 

II : I I I I I I I I I I : I I I : I I I I I I I I III : : I : 

Db 311 ACSTECAHWRSRECMAPPPQNGGRDCSGTLLDSKNCTDGLCVLNQRTLNDPKSRPLEPSG 370

Qy 358 DVALYVGL-IAVAVCLVLLLLVLILVYCRKKEGLDSDVADSS-ILTSGFQPVSIKPSKAD 415

I I I I I I I : I I I I : I : I :: I I I I : I : I I I I I I I I I : I : : 

Db 371 DVALYAGLVVAVFVVLAVLMAVGVIVYRRNCRDFDTDITDSSAALTGGFHPVNFKTARPS 430

Qy 416 NPHLL--TIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPL 459

I I I I : III:: I : I : II : I : I I I I II 
Db 431 NPQLLHPSAPPDLTASAGIYRGPVYALQDS-ADKIPMTNSPLLDPLPSLKIKVYDSSTIG 489 

Qy 460 -GGG RHTLHHSSPTSEAEEFVSRLSTQNYFRSLPRGT 495

II I I I I : | : | : | | | 

Db 490 SGAGLADGADLLGVLPPGTYPGDFSRDTHFLHLRS ASLGSQ-HLLGLPRDP 539 

Qy 496 SNMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHKPEDVRLPLA-GCQTL 554 

I : III 111111111:111:1 I I I : I I I : : I I : : I I I I I : I I I : 

Db 540 SSSVSGTFGCLGGRLTIPGTGVSLLVPNGAIPQGKFYDLYLRINKTEST-LPLSEGSQTV 598 

Qy 555 LSPIVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSWEDVLHLGEEAPSHL 614

II I I : I I I I : I I I II : I : I I I I : I I I : : I I I : I : I I I : 

Db 599 LSPSWCGPTGLLLCRPWLTVPHCAEVIAGDWIFQLKTQAHQGHWEEVVTLDEETLNTP 658 

Qy 615 YYCQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVACTSLEYNIRVYCLHDT 674

I I I I I I : I : : : I I I : I I : I : I I I I : I : I I I I I I I I I : : I I I I I I I 
Db 659 CYCQLEAKSCHILLDQLGTYVFTGESYSRSAVKRLQLAIFAPALCTSLEYSLRVYCLEDT 718 

Qy 675 HDALKEWQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLWKSKLLVSYQEIPFY 734 

I I I I I : : I I : III I : : I I : I I I I I I I I I I I I : I I : I : I : I I I I I I I I I I I 
Db 719 PAALKEVLELERTLGGYLVEEPKTLLFKDSYHNLRLSLHDIPHAHWRSKLIAKYQEIPFY 778 

Qy 735 HIWNGTQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFSINFNITKDTRFAELLAL 794

I : I I I : I : I I I I I I I II I : : : : I I : I I I I I : I I I : : : : I III 



Db 779 HVWNGSQKALHCTFTLERHSLASTEFTCKVCVRQVEGEGQIFQLHTTLA-ETPAGSLDAL 837



Qy 795 ESEAGVPAL--VGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLAQKLHLDSHLSFFASK 852

I I I : I I I I I I I 11111:1111 I I I I I I I I I I : I : I I I : I 

Db 838 CSAPGNAATTQLGPYAFKIPLSIRQKICNSLDAPNSRGNDWRLLAQKLSMDRYLNYFATK 897

Qy 853 PSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAEC 898 

I I I : I I : I I I I I : I : I : | | : | : : I : : : :: : I 
Db 898 ASPTGVILDLWEARQQDDGDLNSLASALEEMGKSEMLVAMTTDGDC 943 



RESULT 15 
AAO18734

ID AAO18734 standard; protein; 933 AA.
XX

AC AAO18734;
XX 

DT 24-OCT-2002 (first entry) 
XX 

DE Human NOV1a protein.
XX 

KW Human; NOVX; autoimmune disease; cancer; infection; inflammatory disease; 

KW storage disorder; muscle disorder; neurodegenerative disorder; nootropic; 

KW developmental defect; neuroprotective; antiparkinsonian; hypotensive; 

KW hypertensive; haemostatic; cardiant; antianginal; dermatological;

KW immunosuppressive; antiinflammatory; virucide; antibacterial; anti-HIV; 

KW antiparasitic; antiallergic; antiasthmatic; antirheumatic; antiarthritic; 

KW vulnerary; anorectic; antidiabetic; immunomodulator; antipsoriatic;

KW nephrotropic; kerolytic; antiulcer; cerebroprotective; anticonvulsant; 

KW antiinfertility; antimanic; antidepressant; metabolic; cytostatic;

KW tranquilizer; analgesic. 

XX 

OS Homo sapiens. 
XX 

PN WO200257450-A2. 
XX 

PD 25-JUL-2002. 
XX 

PF 29-NOV-2001; 2001WO-US048922.
XX

PR 29-NOV-2000; 2000US-0253834P.
PR 30-NOV-2000; 2000US-0250926P.
PR 25-JAN-2001; 2001US-0264180P.
PR 20-AUG-2001; 2001US-0313656P.
PR 05-OCT-2001; 2001US-0327456P.
PR 28-NOV-2001; 2001US-00327456.
XX 

PA (CURA-) CURAGEN CORP. 
XX 

PI Edinger S, Macdougall JR, Millet I, Ellerman K, Stone DJ; 

PI Gerlach V, Grosse WM, Alsobrook JP, Lepley DM, Rieger D, Burgess CE; 

PI Casman SJ, Spytek KA, Boldog FL, Li L, Padigaru M, Mishra V; 

PI Patturajan M, Shenoy S, Rastelli L, Tchernev VT, Vernet CAM; 

PI Zerhusen BD, Malyankar UM, Guo X, Miller CE, Gangolli EA; 

XX 

DR WPI; 2002-590741/63. 

DR N-PSDB; ABT06279. 



XX 

PT Novel isolated polypeptide, designated NOVX, useful for treating or 

PT preventing in NOVX-associated disorders e.g. cardiomyopathy, 

PT atherosclerosis, diabetes, cancer, allergy, asthma, Crohn's disease.

XX 

PS Claim 1; Page 13; 353pp; English. 
XX 

CC The present invention provides the protein and coding sequences of 

CC several novel human proteins, designated NOVX. These can be used in the 

CC treatment of, amongst others, cancers, autoimmune diseases, infections, 

CC inflammatory diseases, storage disorders, muscle disorders, 

CC neurodegenerative diseases and developmental defects. The present 

CC sequence is a protein of the invention 

XX 

SQ Sequence 933 AA; 

Query Match 53.5%; Score 2563.5; DB 5; Length 933; 

Best Local Similarity 53.4%; Pred. No. 2.7e-204; 

Matches 501; Conservative 147; Mismatches 245; Indels 45; Gaps 15; 
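
The percentages on the statistics lines above appear to follow from the other figures by simple arithmetic. The sketch below is an assumption inferred from the numbers in this report, not a documented GenCore definition: Query Match is taken as Score divided by Perfect score, and Best Local Similarity as Matches divided by the number of aligned columns (Matches + Conservative + Mismatches + Indels). The function names are hypothetical.

```python
def query_match_percent(score: float, perfect_score: float) -> float:
    """Score as a percentage of the query's perfect (self-alignment) score."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity_percent(matches: int, conservative: int,
                                  mismatches: int, indels: int) -> float:
    """Identical columns as a percentage of all aligned columns (assumed definition)."""
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


# Figures copied from the statistics lines directly above (perfect score 4791).
print(query_match_percent(2563.5, 4791))                 # 53.5
print(best_local_similarity_percent(501, 147, 245, 45))  # 53.4
```

The same two relations reproduce the percentages printed for the other results in this report, for example 862 matches over 898 columns giving 96.0%.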

Qy 1 MAVRPGLWPALLGIVLAAW---LRGSGAQQ-SATVANPVPGANPDLLPHFLVEPEDVYIV 56

III I I I : I I I : I I : : I I : I I : I I I I : I I I I 

Db 1 MGARSGARGALLLALLLCWDPRLSQAGTDSGSEVLPDSFPSAPAEPLPYFLQEPQDAYIV 60 

Qy 57 KNKPVLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKV 116 

I I I I I I I : I 11111:11111111 I III : I : : I I I I : I I I I I I I I : : 
Db 61 KNKPVELRCRAFPATQIYFKCNGEWVSQNDHVTQEGLDEATGLRVREVQIEVSRQQVEEL 120 

Qy 117 FGLEEYWCQCVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGI 176 

I I I I : I I I II I I I I I : I I I I I : : I I : I I I I I I I I : I I I I III I : : : I I I I I I I : 
Db 121 FGLEDYWCQCVAWSSAGTTKSRRAYVRIAYLRKNFDQEPLGKEVPLDHEVLLQCRPPEGV 180 

Qy 177 PPAEVEWLRNEDLVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSASAA 236

I I I I I I I : II I : : I I : I I : I : I : I : : I I I I I : I II I I I I I I I I I I I : I II : I 
Db 181 PVAEVEWLKNEDVIDPTQDTNFLLTIDHNLIIRQARLSDTANYTCVAKNIVAKRRSTTAT 240 

Qy 237 VIVYVNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCP 296

I I I I I I I I I I : I I I I I I I I I I I I I I : I : I I I I I I I I I I I I I I I I I I I I I I : I I 
Db 241 VIVYVNGGWSSWAEWSPCSNRCGRGWQKRTRTCTNPAPLNGGAFCEGQAFQKTACTTICP 300 

Qy 297 VDGSWSPWSKWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVH-SASG 355 

I I I : I : I I I I I I I : I I I I I I I I I I : I I I : I I I I I :: I I I I I : III 
Db 301 VDGAWTEWSKWSACSTECAHWRSRECMAPPPQNGGRDCSGTLLDSKNCTDGLCMQLEASG 360 

Qy 356 PEDVALYVGL-IAVAVCLVLLLLVLILVYCRKKEGLDSDVADSS-ILTSGFQPVSIKPSK 413

I I I I I I : I : I : : I : I : : I I I I : I : I I I I I . I I I I : I : : 

Db 361 — DAAL YAGL WAI FVWAI IJ^VGWVYRRNC RD FDT D I T D S S AALT GGFH P VN FKT AR 418 

Qy 414 ADNPHLL--TIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSSP 471

I I I I : : I II : : I : I : I I : I : I I I I I I : : : I I 
Db 419 PSNPQLLHPSVPPDLTASAGIYRGPVT.ALQDS-TDKIPMTNSPLLDPLPSLKVKVYSSST 477 

Qy 472 T SEAEEFVSRLSTQNY FRS LPRGTSNMTYGTF 503 

I : : : : | | | | I I I : I I I 

Db 478 TGSGPGLADGADLLGVLPPGTYPSDFARDTHFLHLRSASLGSQQLLGLPRDPGSSVSGTF 537



Qy 504 NFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHKPEDVRLPLA-GCQTLLSPIVSCG 562



I I I I I I I I I : I I I : I I I I : I I I I : I I : : I I I I I : I I I : I I I I : I I 

Db 538 GCLGGRLSIPGTGVSLLVPNGAIPQGKFYEMYLLINKAEST-LPLSEGTQTVLSPSVTCG 596

Qy 563 PPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSWEDVLHLGEEAPSHLYYCQLEAS 622 

I hll Mill I II I I I : I I I : : I I I : I : I I I : I I I I I 

Db 597 PTGLLLCRPVILTMPHCAEVSARDWIFQLKTQAHQGHWEEVVTLDEETLNTPCYCQLEPR 656

Qy 623 ACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVACTSLEYNIRVYCLHDTHDALKEVV 682

II:: : I I I : I I : I : I I I I : I : I I I I I I I I I : : I I I I I I I Mill: 
Db 657 ACHILLDQLGTYVFTGESYSRSAVKRLQLAVFAPALCTSLEYSLRVYCLEDTPVALKEVL 716

Qy 683 QLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLWKSKLLVSYQEIPFYHIWNGTQR 742 

: I I : I M I : : M : I I I I I I I I I M I : I I : I : I : I I I I I i I I I I I I I I : I : I : 
Db 717 ELERTLGGYLVEEPKPLMFKDSYHNLRLSLHDLPHAHWRSKLLAKYQEIPFYHIWSGSQK 776 

Qy 743 YLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFSINFNITKDTRFAELLALESEAG--V 800

M I II I I I I ! : : : : I I I : I I M I : I I I : : : : I I II I I 

Db 777 ALHCTFTLERHSLASTELTCKICVRQVEGEGQIFQLHTTLA-ETPAGSLDTLCSAPGSTV 835 

Qy 801 PALVGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLAQKLHLDSHLSFFASKPSPTAMIL 860 

: I I I I I I I 11111:1111 I I I I I I I I I I : I : I : : II : I I I I : I I 
Db 836 TTQLGPYAFKIPLSIRQKICNSLDAPNSRGNDWRMLAQKLSMDRYLNYFATKASPTGVIL 895 

Qy 861 NLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAEC 898 

: I I I I : | : | : | | : | : : | : : : : : : | 

Db 896 DLWEALQQDDGDLNSIASALEEMGKSEMLVAVATDGDC 933



Search completed: July 12, 2004, 22:57:23 
Job time : 96 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model 



Run on:         July 12, 2004, 22:56:00 ; Search time 27 Seconds
                (without alignments)
                1717.042 Million cell updates/sec

Title:          US-10-624-932-2
Perfect score:  4791
Sequence:       1 MAVRPGLWPALLGIVLAAWL AVAGLGQPDAGLFTVSEAEC 898

Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5

Searched:       389414 seqs, 51625971 residues

Total number of hits satisfying chosen parameters: 389414

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
Listing first 45 summaries 

Database : Issued_Patents_AA:*
1: /cgn2_6/ptodata/2/iaa/5A_COMB.pep:*
2: /cgn2_6/ptodata/2/iaa/5B_COMB.pep:*
3: /cgn2_6/ptodata/2/iaa/6A_COMB.pep:*
4: /cgn2_6/ptodata/2/iaa/6B_COMB.pep:*
5: /cgn2_6/ptodata/2/iaa/PCTUS_COMB.pep:*
6: /cgn2_6/ptodata/2/iaa/backfiles1.pep:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
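
The throughput figure in the run header can be cross-checked against the other header values. This is a minimal sketch, assuming that one cell update means one query residue compared against one database residue (the report does not define the term); the function name is hypothetical.

```python
def million_cell_updates_per_sec(query_length: int, db_residues: int,
                                 search_seconds: float) -> float:
    """Alignment-matrix cells filled per second, in millions (assumed definition)."""
    return query_length * db_residues / search_seconds / 1e6


# Header values: 898-residue query, 51625971 residues searched, 27 seconds.
print(round(million_cell_updates_per_sec(898, 51625971, 27), 3))  # 1717.042
```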
ALIGNMENTS



RESULT 1 
US-08-808-982-5 

Sequence 5, Application US/08808982 
Patent No. 5939271 
GENERAL INFORMATION: 

APPLICANT: Tessier-Lavigne, Marc 
APPLICANT: Leonardo, E. David 
APPLICANT: Hink, Lindsay 
APPLICANT: Masu, Masayuki 
APPLICANT: Kazuko, Keino-Masu 
TITLE OF INVENTION: Netrin Receptors 
NUMBER OF SEQUENCES: 8 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: SCIENCE & TECHNOLOGY LAW GROUP 
STREET: 268 BUSH STREET, SUITE 3200 
CITY: SAN FRANCISCO 
STATE: CALIFORNIA 
COUNTRY: USA 



ZIP: 94104 
COMPUTER READABLE FORM: 
; MEDIUM TYPE: Floppy disk 

; COMPUTER: IBM PC compatible 

; OPERATING SYSTEM: PC-DOS/MS-DOS 

; SOFTWARE: PatentIn Release #1.0, Version #1.30

CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/808,982

FILING DATE: 

CLASSIFICATION: 530 
ATTORNEY/AGENT INFORMATION: 

NAME: OSMAN, RICHARD A 

REGISTRATION NUMBER: 36,627 

REFERENCE/ DOCKET NUMBER: UC96-217 
; TELECOMMUNICATION INFORMATION: 
; TELEPHONE: (415) 343-4341 

; TELEFAX: (415) 343-4342 

; INFORMATION FOR SEQ ID NO: 5: 
; SEQUENCE CHARACTERISTICS: 

LENGTH: 898 amino acids 
; TYPE: amino acid 

; STRANDEDNESS: not relevant 

; TOPOLOGY: not relevant 

MOLECULE TYPE: peptide 
US-08-808-982-5 

Query Match 96.8%; Score 4638; DB 2; Length 898; 

Best Local Similarity 96.0%; Pred. No. 0; 

Matches 862; Conservative 17; Mismatches 19; Indels 0; Gaps 0 
Qy   1 MAVRPGLWPALLGIVLAAWLRGSGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKP 60
       ||||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MAVRPGLWPVLLGIVLAAWLRGSGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKP 60

Qy  61 VLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLE 120
       |||||||||||||||||||||||||||||||||| |||||||||||||||||||||||||
Db  61 VLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDSSSGLPTMEVRINVSRQQVEKVFGLE 120

Qy 121 EYWCQCVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAE 180
       ||||||||||||||||||||||||| ||||||||||||||||||||||||||||||||||
Db 121 EYWCQCVAWSSSGTTKSQKAYIRIAYLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAE 180

Qy 181 VEWLRNEDLVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSASAAVIVY 240
       |||||||||||||||||||||||||||||||||||||||||||||||||||| |||||||
Db 181 VEWLRNEDLVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSTSAAVIVY 240

Qy 241 VNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGS 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 VNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGS 300

Qy 301 WSPWSKWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHSASGPEDVA 360
       || |||||||||||||||||||||||||||||||:| ||||||||||||:|:|| |||||
Db 301 WSSWSKWSACGLDCTHWRSRECSDPAPRNGGEECRGADLDTRNCTSDLCLHTASCPEDVA 360

Qy 361 LYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLL 420
       ||:||:|||||| |||| | |:||||||||||||||||||||||||||||||||||||||
Db 361 LYIGLVAVAVCLFLLLLALGLIYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLL 420

Qy 421 TIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSSPTSEAEEFVS 480
       |||||||||||||||||| |||||||||||:||||||||| |||||||||||||||:|||
Db 421 TIQPDLSTTTTTYQGSLCSRQDGPSPKFQLSNGHLLSPLGSGRHTLHHSSPTSEAEDFVS 480

Qy 481 RLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHK 540
       |||||||||||||||||| |||||||||||||||||||||||||||||||||||||||||
Db 481 RLSTQNYFRSLPRGTSNMAYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHK 540

Qy 541 PEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSW 600
       |||||||||||||||||:||||||||||||||||||||||||||||||||||||||||||
Db 541 PEDVRLPLAGCQTLLSPVVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSW 600

Qy 601 EDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVACT 660
       |||||||||:||||||||||| ||||||||||||||||||||||| |||:||||||||||
Db 601 EDVLHLGEESPSHLYYCQLEAGACYVFTEQLGRFALVGEALSVAATKRLRLLLFAPVACT 660

Qy 661 SLEYNIRVYCLHDTHDALKEVVQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLW 720
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 661 SLEYNIRVYCLHDTHDALKEVVQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLW 720

Qy 721 KSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFSINF 780
       |||||||||||||||||||||:||||||||||:: ||||||||:||||||||||||:|||
Db 721 KSKLLVSYQEIPFYHIWNGTQQYLHCTFTLERINASTSDLACKVWVWQVEGDGQSFNINF 780

Qy 781 NITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLAQKL 840
       ||||||||||||||||| ||||||||||||||||||||||:|||||| ||||||||||||
Db 781 NITKDTRFAELLALESEGGVPALVGPSAFKIPFLIRQKIIASLDPPCSRGADWRTLAQKL 840

Qy 841 HLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAEC 898
       ||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||
Db 841 HLDSHLSFFASKPSPTAMILNLWEARHFPNGNLGQLAAAVAGLGQPDAGLFTVSEAEC 898



RESULT 2 

US-09-306-902A-5 

; Sequence 5, Application US/09306902A 
; Patent No. 6277585 

GENERAL INFORMATION: 
; APPLICANT: Tessier-Lavigne, Marc 

; Leonardo, E. David 

; Hink, Lindsay 

; Masu, Masayuki 

; Kazuko, Keino-Masu 

TITLE OF INVENTION: Netrin Receptors 
NUMBER OF SEQUENCES: 9 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: SCIENCE & TECHNOLOGY LAW GROUP 
STREET: 268 BUSH STREET, SUITE 3200 
CITY: SAN FRANCISCO 
STATE: CALIFORNIA 
COUNTRY: USA 
ZIP: 94104 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 



SOFTWARE: PatentIn Release #1.0, Version #1.30
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/09/306,902A
FILING DATE: 07-May-1999 
CLASSIFICATION: <Unknown> 
ATTORNEY/AGENT INFORMATION: 
NAME: OSMAN, RICHARD A 
REGISTRATION NUMBER: 36,627 
REFERENCE/ DOCKET NUMBER: UC96-217 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (415) 343-4341 
TELEFAX: (415) 343-4342 
INFORMATION FOR SEQ ID NO: 5: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 898 amino acids
TYPE: amino acid 
STRANDEDNESS: not relevant 
TOPOLOGY: not relevant 
MOLECULE TYPE: peptide 
SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
US-09-306-902A-5 

Query Match 96.8%; Score 4638; DB 3; Length 898; 

Best Local Similarity 96.0%; Pred. No. 0; 

Matches 862; Conservative 17; Mismatches 19; Indels 0; Gaps 0 

Qy   1 MAVRPGLWPALLGIVLAAWLRGSGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKP 60
       ||||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 MAVRPGLWPVLLGIVLAAWLRGSGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKP 60

Qy  61 VLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLE 120
       |||||||||||||||||||||||||||||||||| |||||||||||||||||||||||||
Db  61 VLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDSSSGLPTMEVRINVSRQQVEKVFGLE 120

Qy 121 EYWCQCVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAE 180
       ||||||||||||||||||||||||| ||||||||||||||||||||||||||||||||||
Db 121 EYWCQCVAWSSSGTTKSQKAYIRIAYLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAE 180

Qy 181 VEWLRNEDLVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSASAAVIVY 240
       |||||||||||||||||||||||||||||||||||||||||||||||||||| |||||||
Db 181 VEWLRNEDLVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSTSAAVIVY 240

Qy 241 VNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGS 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 VNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGS 300

Qy 301 WSPWSKWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHSASGPEDVA 360
       || |||||||||||||||||||||||||||||||:| ||||||||||||:|:|| |||||
Db 301 WSSWSKWSACGLDCTHWRSRECSDPAPRNGGEECRGADLDTRNCTSDLCLHTASCPEDVA 360

Qy 361 LYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLL 420
       ||:||:|||||| |||| | |:||||||||||||||||||||||||||||||||||||||
Db 361 LYIGLVAVAVCLFLLLLALGLIYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLL 420

Qy 421 TIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSSPTSEAEEFVS 480
       |||||||||||||||||| |||||||||||:||||||||| |||||||||||||||:|||
Db 421 TIQPDLSTTTTTYQGSLCSRQDGPSPKFQLSNGHLLSPLGSGRHTLHHSSPTSEAEDFVS 480

Qy 481 RLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHK 540
       |||||||||||||||||| |||||||||||||||||||||||||||||||||||||||||
Db 481 RLSTQNYFRSLPRGTSNMAYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHK 540

Qy 541 PEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSW 600
       |||||||||||||||||:||||||||||||||||||||||||||||||||||||||||||
Db 541 PEDVRLPLAGCQTLLSPVVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSW 600

Qy 601 EDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVACT 660
       |||||||||:||||||||||| ||||||||||||||||||||||| |||:||||||||||
Db 601 EDVLHLGEESPSHLYYCQLEAGACYVFTEQLGRFALVGEALSVAATKRLRLLLFAPVACT 660

Qy 661 SLEYNIRVYCLHDTHDALKEVVQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLW 720
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 661 SLEYNIRVYCLHDTHDALKEVVQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLW 720

Qy 721 KSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFSINF 780
       |||||||||||||||||||||:||||||||||:: ||||||||:||||||||||||:|||
Db 721 KSKLLVSYQEIPFYHIWNGTQQYLHCTFTLERINASTSDLACKVWVWQVEGDGQSFNINF 780

Qy 781 NITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLAQKL 840
       ||||||||||||||||| ||||||||||||||||||||||:|||||| ||||||||||||
Db 781 NITKDTRFAELLALESEGGVPALVGPSAFKIPFLIRQKIIASLDPPCSRGADWRTLAQKL 840

Qy 841 HLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAEC 898
       ||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||
Db 841 HLDSHLSFFASKPSPTAMILNLWEARHFPNGNLGQLAAAVAGLGQPDAGLFTVSEAEC 898



RESULT 3 
US-08-808-982-6 

Sequence 6, Application US/08808982 
Patent No. 5939271 
GENERAL INFORMATION: 

APPLICANT: Tessier-Lavigne, Marc 
APPLICANT: Leonardo, E. David 
APPLICANT: Hink, Lindsay 
APPLICANT: Masu, Masayuki 
APPLICANT: Kazuko, Keino-Masu 
TITLE OF INVENTION: Netrin Receptors 
NUMBER OF SEQUENCES: 8 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: SCIENCE & TECHNOLOGY LAW GROUP 
STREET: 268 BUSH STREET, SUITE 3200 
CITY: SAN FRANCISCO 
STATE: CALIFORNIA 
COUNTRY: USA 
ZIP: 94104 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.30
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/808,982 
FILING DATE: 



CLASSIFICATION: 530 
ATTORNEY/AGENT INFORMATION:
NAME: OSMAN, RICHARD A 
REGISTRATION NUMBER: 36,627 
REFERENCE/ DOCKET NUMBER: UC96-217 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (415) 343-4341 
TELEFAX: (415) 343-4342 
INFORMATION FOR SEQ ID NO: 6: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 557 amino acids 
TYPE: amino acid 
STRANDEDNESS: not relevant 
TOPOLOGY: not relevant 
MOLECULE TYPE: peptide 
US-08-808-982-6 

Query Match 58.8%; Score 2815.5; DB 2; Length 557; 

Best Local Similarity 96.8%; Pred. No. 5.2e-259; 

Matches 539; Conservative 2; Mismatches 15; Indels 1; Gaps 1 

Qy 343 NCTSDLCVHSASGPEDVALYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTS 402
       |||||| ||:||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 NCTSDLXVHTASGPEDVALYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTS 60

Qy 403 GFQPVSIKPSKADNPHLLTIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGG 462
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 GFQPVSIKPSKADNPHLLTIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGG 120

Qy 463 RHTLHHSSPTSEAEEFVSRLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIP 522
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 RHTLHHSSPTSEAEEFVSRLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIP 180

Qy 523 PDAIPRGKIYEIYLTLHKPEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEP 582
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 PDAIPRGKIYEIYLTLHKPEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEP 240

Qy 583 SPDSWSLRLKKQSCEGSWEDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALS 642
       ||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 SPDSWSLALKKQSCEGSWEDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALS 300

Qy 643 VAAAKRLKLLLFAPVACTSLEYNIRVYCLHDTHDALKEVVQLEKQLGGQLIQEPRVLHFK 702
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 VAAAKRLKLLLFAPVACTSLEYNIRVYCLHDTHDALKEVVQLEKQLGGQLIQEPRVLHLX 360

Qy 703 DSYHNLRLSIHDVPSSLWKSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLAC 762
       |||||| || ||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 DSYHNLXLSXHDVPSSLWKSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLAC 420

Qy 763 KLWVWQVEGDGQSFSINFNITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISS 822
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 KLWVWQVEGDGQSFSINFNITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISS 480

Qy 823 LDPPCRRGADWRTLAQKLHLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAG 882
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 481 LDPPCRRGADWRTLAQKLHLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAG 540

Qy 883 LGQPDAGLFT-VSEAEC 898
              | :  |||||
Db 541 TXPAGRWLLSQCSEAEC 557



RESULT 4 

US-09-306-902A-6 

; Sequence 6, Application US/09306902A 

; Patent No. 6277585 

; GENERAL INFORMATION: 

; APPLICANT: Tessier-Lavigne, Marc 

; Leonardo, E. David 

; Hink, Lindsay 

; Masu, Masayuki 

; Kazuko, Keino-Masu 

TITLE OF INVENTION: Netrin Receptors 
NUMBER OF SEQUENCES: 9 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: SCIENCE & TECHNOLOGY LAW GROUP 
STREET: 268 BUSH STREET, SUITE 3200 
; CITY: SAN FRANCISCO 

STATE: CALIFORNIA 
COUNTRY: USA 
ZIP: 94104 
; COMPUTER READABLE FORM: 

; MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 
SOFTWARE: PatentIn Release #1.0, Version #1.30
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/09/306,902A
FILING DATE: 07-May-1999 
CLASSIFICATION: <Unknown> 
; ATTORNEY/AGENT INFORMATION:

NAME: OSMAN, RICHARD A 
; REGISTRATION NUMBER: 36,627 

; REFERENCE/ DOCKET NUMBER: UC96-217 

; TELECOMMUNICATION INFORMATION: 

TELEPHONE: (415) 343-4341 
TELEFAX: (415) 343-4342
INFORMATION FOR SEQ ID NO: 6: 
SEQUENCE CHARACTERISTICS: 
; LENGTH: 557 amino acids 

TYPE: amino acid 
; STRANDEDNESS: not relevant 

; TOPOLOGY: not relevant 

MOLECULE TYPE: peptide 
SEQUENCE DESCRIPTION: SEQ ID NO: 6: 
US-09-306-902A-6 

Query Match 58.8%; Score 2815.5; DB 3; Length 557; 

Best Local Similarity 96.8%; Pred. No. 5.2e-259; 

Matches 539; Conservative 2; Mismatches 15; Indels 1; Gaps 1; 

Qy 343 NCTSDLCVHSASGPEDVALYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTS 402
       |||||| ||:||||||||||||||||||||||||||||||||||||||||||||||||||
Db   1 NCTSDLXVHTASGPEDVALYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTS 60

Qy 403 GFQPVSIKPSKADNPHLLTIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGG 462
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  61 GFQPVSIKPSKADNPHLLTIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGG 120

Qy 463 RHTLHHSSPTSEAEEFVSRLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIP 522
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 RHTLHHSSPTSEAEEFVSRLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIP 180

Qy 523 PDAIPRGKIYEIYLTLHKPEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEP 582
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 PDAIPRGKIYEIYLTLHKPEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEP 240

Qy 583 SPDSWSLRLKKQSCEGSWEDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALS 642
       ||||||| ||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 SPDSWSLALKKQSCEGSWEDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALS 300

Qy 643 VAAAKRLKLLLFAPVACTSLEYNIRVYCLHDTHDALKEVVQLEKQLGGQLIQEPRVLHFK 702
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 VAAAKRLKLLLFAPVACTSLEYNIRVYCLHDTHDALKEVVQLEKQLGGQLIQEPRVLHLX 360

Qy 703 DSYHNLRLSIHDVPSSLWKSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLAC 762
       |||||| || ||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 DSYHNLXLSXHDVPSSLWKSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLAC 420

Qy 763 KLWVWQVEGDGQSFSINFNITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISS 822
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 KLWVWQVEGDGQSFSINFNITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISS 480

Qy 823 LDPPCRRGADWRTLAQKLHLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAG 882
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 481 LDPPCRRGADWRTLAQKLHLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAG 540

Qy 883 LGQPDAGLFT-VSEAEC 898
              | :  |||||
Db 541 TXPAGRWLLSQCSEAEC 557



RESULT 5 
US-08-808-982-7 

Sequence 7, Application US/08808982
Patent No. 5939271 
GENERAL INFORMATION: 

APPLICANT: Tessier-Lavigne, Marc 
APPLICANT: Leonardo, E. David 
APPLICANT: Hink, Lindsay 
APPLICANT: Masu, Masayuki 
APPLICANT: Kazuko, Keino-Masu 
TITLE OF INVENTION: Netrin Receptors 
NUMBER OF SEQUENCES: 8 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: SCIENCE & TECHNOLOGY LAW GROUP 
STREET: 268 BUSH STREET, SUITE 3200 
CITY: SAN FRANCISCO 
STATE: CALIFORNIA 
COUNTRY: USA 
ZIP: 94104 



COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.30
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/808,982
FILING DATE: 
CLASSIFICATION: 530 
ATTORNEY/AGENT INFORMATION: 
NAME: OSMAN, RICHARD A 
REGISTRATION NUMBER: 36,627 
REFERENCE/ DOCKET NUMBER: UC96-217 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (415) 343-4341 
TELEFAX: (415) 343-4342 
INFORMATION FOR SEQ ID NO: 7: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 943 amino acids 
TYPE: amino acid 
STRANDEDNESS: not relevant
TOPOLOGY: not relevant 
MOLECULE TYPE: peptide 
US-08-808-982-7 

Query Match 53.7%; Score 2571.5; DB 2; Length 943; 

Best Local Similarity 53.3%; Pred. No. 2.2e-235; 

Matches 504; Conservative 142; Mismatches 221; Indels 79; Gaps 16; 

Qy 9 PALLGIVLAAWLRGSGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKPVLLVCKAV 68

1:111 I I I I : : II : I I I I I : I I I I I I I I I I I I I i : I 

Db 21 PSLAGI DSGAQ GLPDSFPSAPAEQLPHFLLEPEDAYIVKNKPVELHCRAF 70 

Qy 69 PATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLEEYWCQCVA 128 

I I I I I : I I I I I I I I I II : I I : : I I I I : I I I I I I I I :: I I I I : I I I I I I I 
Db 71 PATQI YFKCNGEWVSQKGHVTQESLDEATGLRI REVQI EVSRQQVEELFGLEDYWCQCVA 130 

Qy 129 WSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAEVEWLRNED 188 

I I I I I I I I I :: I I I I I I I I I I I : I I I I I I I I I : : : I I I I I I I : I I I I I I I : I I I 

Db 131 WSSSGTTKSRRAYIRIAYLRKNFDQEPLAKEVPLDHEVLLQCRPPEGVPVAEVEWLKNED 190 

Qy 189 LVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSASAAVIVYVNGGWSTW 248

: : I I : I I : I : I : I :: I I I I I : I I I I I I I I I I I I I I : I I I : I I I I I I I I I I I : I 
Db 191 VIDPAQDTNFLLTIDHNLIIRQARLSDTANYTCVAKNIVAKRRSTTATVIVYVNGGWSSW 250 

Qy 249 TEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGSWSPWSKWS 308

III II I I I I I I I I : I : I I I I I I I I I I I I I I I I I I I I I I : I I I I I : I : I I I I I 
Db 251 AEWSPCSNRCGRGWQKRTRTCTNPAPLNGGAFCEGQACQKTACTTVCPVDGAWTEWSKWS 310 

Qy 309 ACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCV-----------HSASGPE 357

II :| lllllll I I : I I I : I M I I : : I I I III : : |: 

Db 311 ACSTECAHWRSRECMAPPPQNGGRDCSGTLLDSKNCTDGLCVLNQRTLNDPKSRPLEPSG 370 

Qy 358 DVALYVGL-IAVAVCLVLLLLVLILVYCRKKEGLDSDVADSS-ILTSGFQPVSIKPSKAD 415

I M I I II : I I I I : I : I : : II I |:|: Ml II II II: I :: 

Db 371 DVALYAGLWAVFWIAVLMAVGVIVYRRNCRDFDTDITDSSAALTGGFHPVNFKTARPS 430 



Qy 416 NPHLL--TIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPL 459

I I I I : III:: I : I : II : I Ml I I I I 
Db 431 NPQLLHPSAPPDLTASAGIYRGPVYALQDS-ADKIPMTNSPLLDPLPSLKIKVYDSSTIG 489 

Qy 460 -GGG RHTLHHSSPTSEAEEFVSRLSTQNYFRSLPRGT 495

II I I I I : | : | : | M 

Db 490 SGAGLADGADLLGVLPPGTYPGDFSRDTHFLHLRS ASLGSQ-HLLGLPRDP 539 

Qy 496 SNMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHKPEDVRLPLA-GCQTL 554

I : III I I I I I I I I I : I I I : I I I I : I I I : : I I : : I I I I I : I I I : 
Db 540 SSSVSGTFGCLGGRLTIPGTGVSLLVPNGAIPQGKFYDLYLRINKTEST-LPLSEGSQTV 598 

Qy 555 LSPIVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSWEDVLHLGEEAPSHL 614

I I I I : I I I I : I I I I I : I : I I I I : I I I : : I I I : I : I I I : 

Db 599 LSPSVTCGPTGLLLCRPVVLTVPHCAEVIAGDWIFQLKTQAHQGHWEEVVTLDEETLNTP 658

Qy 615 YYCQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVACTSLEYNIRVYCLHDT 674

I II I I I : I : : : I I I : I I : I : I I I I : I : I I I I I I I I I : : I I I I I I I 
Db 659 CYCQLEAKSCHILLDQLGTYVFTGESYSRSAVKRLQLAIFAPALCTSLEYSLRVYCLEDT 718 

Qy 675 HDALKEVVQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLWKSKLLVSYQEIPFY 734

I I I I I : : I I : III I : : I I : I I I I I I I I I I I I : I I : I : I : I I I I I I I I I I I 
Db 719 PAALKEVLELERTLGGYLVEEPKTLLFKDSYHNLRLSLHDIPHAHWRSKLLAKYQEIPFY 778 

Qy 735 HIWNGTQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFSINFNITKDTRFAELLAL 794

I : I I I : I : I I I I I I I I I I : : : : I I : I I I I I : I I I : : : : I III 
Db 779 HVWNGSQKALHCTFTLERHSLASTEFTCKVCVRQVEGEGQIFQLHTTLA-ETPAGSLDAL 837

Qy 795 ESEAGVPAL--VGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLAQKLHLDSHLSFFASK 852

I I I MINIM I I I I I M I I I I I II I I I I I I M M :: I I M 
Db 838 CSAPGNAATTQLGPYAFKIPLSIRQKICNSLDAPNSRGNDWRLLAQKLSMDRYLNYFATK 897 

Qy 853 PSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAEC 898

I I I M I M II I I : I M : I I M : M : : : : : M 
Db 898 ASPTGVILDLWEARQQDDGDLNSLASALEEMGKSEMLVAMTTDGDC 943 



RESULT 6 

US-09-306-902A-7 

; Sequence 7, Application US/09306902A
; Patent No. 6277585 

GENERAL INFORMATION: 

APPLICANT: Tessier-Lavigne, Marc
; Leonardo, E. David 

; Hink, Lindsay 

; Masu, Masayuki 

; Kazuko, Keino-Masu 

TITLE OF INVENTION: Netrin Receptors 
; NUMBER OF SEQUENCES: 9 

; CORRESPONDENCE ADDRESS: 

; ADDRESSEE: SCIENCE & TECHNOLOGY LAW GROUP 

STREET: 268 BUSH STREET, SUITE 3200
CITY: SAN FRANCISCO 
STATE: CALIFORNIA 
COUNTRY: USA 
ZIP: 94104 
COMPUTER READABLE FORM: 



MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 
; SOFTWARE: PatentIn Release #1.0, Version #1.30

; CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/09/306,902A

FILING DATE: 07-May-1999 

CLASSIFICATION: <Unknown> 
ATTORNEY/AGENT INFORMATION: 

NAME: OSMAN, RICHARD A 

REGISTRATION NUMBER: 36,627 

REFERENCE/ DOCKET NUMBER: UC96-217 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (415) 343-4341 
; TELEFAX: (415) 343-4342 

INFORMATION FOR SEQ ID NO: 7: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 943 amino acids 
; TYPE: amino acid 

; STRANDEDNESS: not relevant 

TOPOLOGY: not relevant 
; MOLECULE TYPE: peptide 

; SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

US-09-306-902A-7 

Query Match 53.7%; Score 2571.5; DB 3; Length 943; 

Best Local Similarity 53.3%; Pred. No. 2.2e-235; 

Matches 504; Conservative 142; Mismatches 221; Indels 79; Gaps 16; 
Qy 9 P AL LG I VLAAW L RG S GAQQ SAT VAN P VP GAN P D L L P H FL VE P EDVY I VKN K P VL LVC KAV 68 



Db 



21 PSLAGI 



I I I I : : II : I I I I I : I I I I I I I I I I I I I I : I 

DSGAQ GLPDSFPSAPAEQLPHFLLEPEDAYIVKNKPVELHCRAF 70 



Qy 



Db 



69 PATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLEEYWCQCVA 12 8 

I I I I I : I I I I I I I I I II : I I : : I I I I : I I I I I I I I : : I I I I : I I I I I I I 
71 PATQIYFKCNGEWVSQKGHVTQESLDEATGLRIREVQIEVSRQQVEELFGLEDYWCQCVA 130 



Qy 129 WSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAEVEWLRNED 188

I I I I I I I I I : : I I I I I I I II I I : I I I I I I I I I : : : I I I I I I I : I I I I I I I : I I I 
Db 131 WSSSGTTKSRRAYIRIAYLRKNFDQEPLAKEVPLDHEVLLQCRPPEGVPVAEVEWLKNED 190



Qy 189 LVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSASAAVIVYVNGGWSTW 248

: : II : I I : I : I : I : : I I I I I : I I I I I I I I I I I I I I : I II : I I I I I I I I I I I : I 
Db 191 VIDPAQDTNFLLTIDHNLIIRQARLSDTANYTCVAKNIVAKRRSTTATVIVYVNGGWSSW 250



Qy 249 TEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGSWSPWSKWS 308

I II II I I I I I I I I : I : I I I I I I I I I I I I I I I I I I I I I I : I I I I I : I : I I I I I 
Db 251 AEWSPCSNRCGRGWQKRTRTCTNPAPLNGGAFCEGQACQKTACTTVCPVDGAWTEWSKWS 310



Qy 309 ACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCV HSASGPE 357

II : I I I I I I I I I I : I I I : I I I I I :: I I I Ml : : I : 
Db

Qy 358 DVALYVGL-IAVAVCLVLLLLVLILVYCRKKEGLDS DVADSS-ILTSGFQPVSIKPSKAD 415

I I II I II : I I I I : I : I :: I I I I : I : I I I I I I I I I : I : : 
Db 371 DVALYAGLWAVFWLAVmAVGVIVYRRNCRDFDTDITDSSAALTGGFHPVNFKTARPS 430



Qy 416 NPHLL--TIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPL 459

I I I I : III:: I : I : II : I : I I II I I 
Db 431 NPQLLHPSAPPDLTASAGIYRGPVYALQDS-ADKIPMTNSPLLDPLPSLKIKVYDSSTIG 489



Qy 460 -GGG RHTLHHSSPTSEAEEFVSRLSTQNYFRSLPRGT 495 

II I I I I : I : I : I I I 

Db 490 SGAGLADGADLLGVLPPGTYPGDFSRDTHFLHLRS ASLGSQ-HLLGLPRDP 539 

Qy 496 SNMTYGTFNFLGGRLMI PNTGI SLLI PPDAI PRGKI YEI YLTLHKPEDVRLPLA-GCQTL 554 

I : III I I I II I I I I : I I I : I I I I : I I I : : I I : : I I I I I : I I I : 

Db 540 SSSVSGTFGCLGGRLTIPGTGVSLLVPNGAIPQGKFYDLYLRINKTEST-LPLSEGSQTV 598 

Qy 555 LSPIVSCGPPGVLLTRPVILANDHCGEPSPDSWSLRLKKQSCEGSWEDVLHLGEEAPSHL 614 

I I I I : I I I I : I I I I I : I : I I I I : I I I : : I I I : I : I I I : 

Db 599 LSPSVTCGPTGLLLCRPWLTVPHCAEVIAGDWIFQLKTQAHQGHWEEWTLDEETLNTP 658 

Qy 615 YYCQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVACTSLEYNIRVYCLHDT 674

I I I I I I : I :: : I I I : I I : I : I I I I : I : I I I I I I II I : : I I I I I I I 
Db 659 CYCQLEAKSCHILLDQLGTYVFTGESYSRSAVKRLQLAIFAPALCTSLEYSLRVYCLEDT 718 

Qy 675 HDALKEVVQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLWKSKLLVSYQEIPFY 734

I I I I I : : I I : III I : : I I : I I I I I I I I I I I I : I I : I : I : I I I I I I I I I I I 
Db 719 PAALKEVLELERTLGGYLVEEPKTLLFKDSYHNLRLSLHDIPHAHWRSKLIAKYQEIPFY 778

Qy 735 HIWNGTQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFSINFNITKDTRFAELLAL 794 

I : I I I : I : I I I I I I I I I I : : : : I I : I I I I I : I I I : : : : I III 
Db 779 HVWNGSQKALHCTFTLERHSLASTEFTCKVCVRQVEGEGQIFQLHTTLA-ETPAGSLDAL 837 

Qy 795 ESEAGVPAL — VGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLAQKLHLDSHLSFFASK 852 

I I I : I I I I I I I 11111:1111 I I I II I I I I I : I : I : : I I : I 
Db 838 CSAPGNAATTQLGPYAFKIPLSIRQKICNSLDAPNSRGNDWRLLAQKLSMDRYLNYFATK 897 

Qy 853 PSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAEC 898 

I I I : II : I I I I I : | : | : | i : | : : | : : : : : : | 
Db 898 ASPTGVILDLWEARQQDDGDLNSLASALEEMGKSEMLVAMTTDGDC 943 
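
The percentages in each hit's summary lines can be recomputed from the other fields. The following is a minimal sketch, assuming that Query Match is the hit score divided by the perfect score and that Best Local Similarity is the count of identical positions divided by all alignment columns (Matches + Conservative + Mismatches + Indels). Both definitions are inferred from the figures printed in this report, not taken from GenCore documentation, and the function names are hypothetical.

```python
def query_match(score: float, perfect_score: float) -> float:
    """Hit score as a percentage of the query's perfect (self) score."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity(matches: int, conservative: int,
                          mismatches: int, indels: int) -> float:
    """Identical positions as a percentage of all alignment columns."""
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


# Values copied from the US-09-306-902A-7 hit above (perfect score 4791).
print(query_match(2571.5, 4791))                 # 53.7
print(best_local_similarity(504, 142, 221, 79))  # 53.3
```

The same two functions reproduce the figures of the later hits, for example 6.2% and 30.5% for US-08-313-288B-19.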



RESULT 7 

US-08-313-288B-19 

; Sequence 19, Application US/08313288B 
; Patent No. 5750502 
; GENERAL INFORMATION: 

APPLICANT: Jessell, Thomas M. and Avihu Klar 
TITLE OF INVENTION: CLONING, EXPRESSION AND USES OF A 
TITLE OF INVENTION: NOVEL SECRETED PROTEIN, F-SPONDIN 
NUMBER OF SEQUENCES: 20 
CORRESPONDENCE ADDRESS: 
; ADDRESSEE: Cooper & Dunham LLP 

; STREET: 1185 Avenue of the Americas 

CITY: New York 
; STATE: New York 

COUNTRY: USA 
ZIP: 10036 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
; COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 



; SOFTWARE: PatentIn Release #1.0, Version #1.30

CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/313,288B

FILING DATE: January 5, 1995 

CLASSIFICATION: 435 
ATTORNEY/AGENT INFORMATION: 

NAME: White, John P. 

REGISTRATION NUMBER: 28,678 

REFERENCE/ DOCKET NUMBER: 40028-A-PCT-US 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (212) 278-0400 

TELEFAX: (212) 391-0526 
; TELEX: 

; INFORMATION FOR SEQ ID NO: 19: 
; SEQUENCE CHARACTERISTICS: 

; LENGTH: 1172 amino acids 

; TYPE: amino acid 

STRANDEDNESS: single 

TOPOLOGY: linear 
MOLECULE TYPE: peptide 
US-08-313-288B-19 

Query Match 6.2%; Score 296.5; DB 1; Length 1172; 

Best Local Similarity 30.5%; Pred. No. 1.4e-18; 

Matches 78; Conservative 28; Mismatches 105; Indels 45; Gaps 9 

Qy 209 RQARLADTANYTCVAKNIVARRRSASAA-VIVYVNGGWSTWTEWSVCSASCGRGWQKRSR 267

: : I I : I I : : I I II : : I I I I I : I I I I : I I I II 

Db 403 QRGRSCDVTSNTCLGPSIQTRACSLSKCDTRIRQDGGWSHWSPWSSCSVTCGVGNITRIR 462

Qy 268 SCTNPAPLNGGAFCEGQNVQKTAC-ATLCPVDGSWSPWSKWSACGLDCT HWRSRECS 323 

I : I I I I I : I : II I I : I I I I I I I I I I I : I I : I I : 

Db 463 LCNSPVPQMGGKNCKGSGRETKACQGAPCPIDGRWSPWSPWSACTVTCAGGIRERTRVCN 522

Qy 324 DPAPRNGGEECQGTDLDTRNCTSDLCVHSASGPEDVALYVGLIAVAVCLVLLLLVLILVY 383 

I I : I I : I I : : I I III II 
Db 523 SPEPQYGGKACVGDVQERQMCNKRSC PVDGCLSNPCFPGAQC 564 

Qy 384 CRKKEGLDSDVADSSILTSGFQPVSI— KPSKADNPHLLTIQPDLSTTTT TYQ 434 

I I I : I I I I : : : : I I : : I : I 

Db 565 SSFPDGS-WSCGFCPVGFLGNGTHCEDLDECALVPDICFSTSKVPRCVNTQP 615 

Qy 435 GSLC PRQDGPSP 446 

II I I I I 

Db 616 GFHCLPCPPRYRGNQP 631 



RESULT 8 
US-08-808-982-8 

Sequence 8, Application US/08808982 
Patent No. 5939271 
GENERAL INFORMATION: 

APPLICANT: Tessier-Lavigne, Marc 
APPLICANT: Leonardo, E. David 
APPLICANT: Hink, Lindsay 
APPLICANT: Masu, Masayuki 
APPLICANT: Kazuko, Keino-Masu 



TITLE OF INVENTION: Netrin Receptors 
NUMBER OF SEQUENCES: 8 
CORRESPONDENCE ADDRESS: 
; ADDRESSEE : SCIENCE & TECHNOLOGY LAW GROUP 

STREET: 268 BUSH STREET, SUITE 3200 

CITY: SAN FRANCISCO 

STATE: CALIFORNIA 

COUNTRY: USA 

ZIP : 94104 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.30
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/808,982

FILING DATE: 
; CLASSIFICATION: 530 

ATTORNEY/AGENT INFORMATION: 
; NAME: OSMAN, RICHARD A 

REGISTRATION NUMBER: 36,627

REFERENCE/DOCKET NUMBER: UC96-217 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (415) 343-4341 

TELEFAX : (415) 343-4342 
; INFORMATION FOR SEQ ID NO: 8: 
; SEQUENCE CHARACTERISTICS: 

; LENGTH: 102 amino acids 

; TYPE: amino acid 

; STRANDEDNESS: not relevant 

TOPOLOGY: not relevant 
MOLECULE TYPE: peptide 
US-08-808-982-8 



Query Match 6.1%; Score 294; DB 2; Length 102; 

Best Local Similarity 56.4%; Pred. No. 3.9e-20; 

Matches 57; Conservative 16; Mismatches 28; Indels 0; Gaps 0 

Qy 608 EEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVACTSLEYNIR 667

II : I I I I II: : I I I : I I : I : I I I I : I : I I I I II I I I : : I 
Db 2 EETLNTPCYXQLEPRACXILLDQLGTYVFTGESYSRSAVKRLQLAVFAPALCTSLEYSLR 61 

Qy 668 VYCLHDTHDALKEVVQLEKQLGGQLIQEPRVLHFKDSYHNL 708

I I I I II I I I I I : : I I : III I : : I I : I I I I I I I II 
Db 62 VYCLEDTPVALKEVLELERTLGGYLVEEPKPLMFKDSYHNL 102 



RESULT 9 

US-09-306-902A-8 

; Sequence 8, Application US/09306902A 
; Patent No. 6277585 

GENERAL INFORMATION: 

APPLICANT: Tessier-Lavigne, Marc 
; Leonardo, E. David 

Hink, Lindsay 
; Masu, Masayuki 

Kazuko, Keino-Masu 



TITLE OF INVENTION: Netrin Receptors 
NUMBER OF SEQUENCES: 9 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: SCIENCE & TECHNOLOGY LAW GROUP 
STREET: 268 BUSH STREET, SUITE 3200 
CITY: SAN FRANCISCO 
STATE: CALIFORNIA 
COUNTRY: USA 
ZIP: 94104 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.30
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/09/306,902A
FILING DATE: 07-May-1999 
CLASSIFICATION: <Unknown> 
ATTORNEY/ AGENT INFORMATION: 
NAME: OSMAN, RICHARD A 
REGISTRATION NUMBER: 36,627 
REFERENCE/ DOCKET NUMBER: UC96-217 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (415) 343-4341 
TELEFAX: (415) 343-4342 
INFORMATION FOR SEQ ID NO: 8: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 102 amino acids 
TYPE: amino acid 
STRANDEDNESS: not relevant 
TOPOLOGY: not relevant 
MOLECULE TYPE: peptide 
SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
US-09-306-902A-8 

Query Match 6.1%; Score 294; DB 3; Length 102; 

Best Local Similarity 56.4%; Pred. No. 3.9e-20; 

Matches 57; Conservative 16; Mismatches 28; Indels 0; Gaps 0 

Qy 608 EEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVACTSLEYNIR 667

II : I I I I II: : I I I : I I : I : I I I I : I : I I I I I I I I I : : I 
Db 2 EETLNTPCYXQLEPRACXILLDQLGTYVFTGESYSRSAVKRLQLAVFAPALCTSLEYSLR 61 

Qy 668 VYCLHDTHDALKEVVQLEKQLGGQLIQEPRVLHFKDSYHNL 708

I I I I II I I I I I : : I I : III I : : I I : I I I I I I I I I 
Db 62 VYCLEDTPVALKEVLELERTLGGYLVEEPKPLMFKDSYHNL 102 



RESULT 10 
PCT-US93-01652-1 

Sequence 1, Application PC/TUS9301652 
GENERAL INFORMATION: 



APPLICANT: Bouck, Noel P.
APPLICANT: Polverini, Peter J.
APPLICANT: Good, Deborah J.
APPLICANT: Frazier, William A.



TITLE OF INVENTION: Method and Composition for 



TITLE OF INVENTION: Inhibiting Angiogenesis 
; NUMBER OF SEQUENCES: 12 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Tilton, Fallon, Lungmus & Chestnut 

STREET: 100 South Wacker Drive, Suite 960 

CITY: Chicago 

STATE: Illinois 

COUNTRY: USA 

ZIP: 60606-4002 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: PCT/US93/01652 
FILING DATE: 19930222 
CLASSIFICATION: 
PRIOR APPLICATION DATA: 
; APPLICATION NUMBER: US/07/841,656 

FILING DATE: 24-FEB-1992 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US/07/464,369
FILING DATE: 12-JAN-1990 
ATTORNEY/ AGENT INFORMATION: 
; NAME : Fentress, Susan B. 

REGISTRATION NUMBER: 31,327 
REFERENCE/ DOCKET NUMBER: 92005-PCT 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (312)-456-8000
TELEFAX: (312)-456-7776
; INFORMATION FOR SEQ ID NO: 1: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 239 amino acids 
TYPE: AMINO ACID 
STRANDEDNESS: unknown 
; TOPOLOGY: unknown 

MOLECULE TYPE: peptide 
PCT-US93-01652-1 

Query Match 5.6%; Score 268.5; DB 5; Length 239; 

Best Local Similarity 33.5%; Pred. No. 4.4e-17; 

Matches 52; Conservative 23; Mismatches 61; Indels 19; Gaps 4 

Qy 207 VVRQARLADTANYTCVAKNIVAR RRSASAAVIVYVNGGWSTWTEWSVCSASC 258

: : : I I : I I : : I : I : I I I I I : I I I I : I 

Db 88 IQQRGRSCDSLNNRCEGSSVQTRTCHIQECDKRFKQ DGGWSHWSPWSSCSVTC 140 

Qy 259 GRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTAC-ATLCPVDGSWSPWSKWSACGLDC 314 

II I I I : I :| I III: : II I I :: I I I I I I I : I 

Db 141 GDGVITRIRLCNSPSPQMNGLPCEGEARETKACKKDACPINGGWGPWSPWDICSVTCGGG 200 

Qy 315 THWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLC 349

I I I I :: I I I : I I : I I : : I I 
Db 201 VQKRSRLCNNPAPQFGGLDCVGDVTENQICNKQDC 235 



RESULT 11 
US-08-313-288B-20 

; Sequence 20, Application US/08313288B 

; Patent No. 5750502 

; GENERAL INFORMATION: 

; APPLICANT: Jessell, Thomas M. and Avihu Klar 

TITLE OF INVENTION: CLONING, EXPRESSION AND USES OF A 
TITLE OF INVENTION: NOVEL SECRETED PROTEIN, F-SPONDIN 
NUMBER OF SEQUENCES: 20 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Cooper & Dunham LLP 

STREET: 1185 Avenue of the Americas 

CITY: New York 
; STATE: New York 

COUNTRY: USA 

ZIP: 10036 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 
; OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.30
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/313,288B

FILING DATE: January 5, 1995 

CLASSIFICATION: 435 
ATTORNEY/AGENT INFORMATION: 

NAME: White, John P. 

REGISTRATION NUMBER: 28,678 

REFERENCE/DOCKET NUMBER: 40028-A-PCT-US
; TELECOMMUNICATION INFORMATION: 
; TELEPHONE: (212) 278-0400 

TELEFAX: (212) 391-0526 

TELEX : 

; INFORMATION FOR SEQ ID NO: 20: 
; SEQUENCE CHARACTERISTICS: 

; LENGTH: 1170 amino acids 

TYPE: amino acid 

STRANDEDNESS : single 

TOPOLOGY: linear 
MOLECULE TYPE: peptide 
US-08-313-288B-20 

Query Match 5.6%; Score 268.5; DB 1; Length 1170; 

Best Local Similarity 32.9%; Pred. No. 6.3e-16; 

Matches 51; Conservative 24; Mismatches 61; Indels 19; Gaps 4 

Qy 207 VVRQARLADTANYTCVAKNIVAR RRSASAAVIVYVNGGWSTWTEWSVCSASC 258

: : : I I : I I : : I : I : I I I I I : I I I I : I 

Db 399 IQQRGRSCDSLNNRCEGSSVQTRTCHIQECDKRFKQ DGGWSHWSPWSSCSVTC 451 

Qy 259 GRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTAC-ATLCPVDGSWSPWSKWSACGLDC 314 

I I I I I : I : I I III: : II I I :: I I I I I I I : I 

Db 452 GDGVITRIRLCNSPSPQMNGKPCEGEARETKACKKDACPINGGWGPWSPWDICSVTCGGG 511 

Qy 315 THWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLC 349

I I I I : : I I : I I : : I I : : I I 
Db 512 VQKRSRLCNNPTPQFGGKDCVGDVTENQICNKQDC 546 



RESULT 12 
US-08-985-526-3 

Sequence 3, Application US/08985526 
Patent No. 6080728 
GENERAL INFORMATION: 

APPLICANT: Mixson, James A 

TITLE OF INVENTION: CARRIER: DNA COMPLEXES CONTAINING DNA
TITLE OF INVENTION: ENCODING ANTI-ANGIOGENIC PEPTIDES AND THEIR USE IN GENE
TITLE OF INVENTION: THERAPY
NUMBER OF SEQUENCES: 43
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Connolly, Bove, Lodge, & Hutz 
STREET: 1220 Market Street, P.O. Box 2207 
CITY: Wilmington 
STATE: Delaware 
COUNTRY: U.S.A. 
ZIP: 19899 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/985,526 
FILING DATE: 
CLASSIFICATION: 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/608,845 
FILING DATE: 16-JUL-1996 
ATTORNEY/ AGENT INFORMATION: 

NAME: McMorrow Jr., Robert G 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (302) 658-9141 
TELEFAX: (302) 658-5613 
INFORMATION FOR SEQ ID NO: 3: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 441 amino acids 
TYPE: amino acid 
TOPOLOGY: linear 
US-08-985-526-3 

Query Match 5.2%; Score 249.5; DB 3; Length 441; 

Best Local Similarity 26.2%; Pred. No. 7.9e-15; 

Matches 88; Conservative 35; Mismatches 112; Indels 101; Gaps 16; 

Qy 75 FKCNGEW VRQVDHVIERSTDGSSGLPTM EVRINVSRQQ V 113 

I I : I I I I I I I I : I I I II : :: : 

Db 132 FKQDGGWSHWSPWSSCSVTCGDGVITRITLCNSPSPQMNGKPCEGEARETKACKKDACPI 191 

Qy 114 EKVFGLEEYWCQCVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPP 173 

s I II II:: I I : ! I I : I I I 

Db 192 NGGWGPWSPWDICSVTCGGGVQKRSRLCV DSRMTEENKELANELR RPP 239 



Qy 174 EGIPPAEVEWLRNED-LVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVA 228

I I : : I I : I I : I : I I 
Db 240 LCYHNG VQYRNNEEWTVDSCTE CHCQNSVT 269

Qy 229 RRRSASAAVIVYVNG GWSTWTEWSVCSASCGRGWQKRSRSC 269 

: I : : I I I I I : I I : I I I I I I I : I I I I 

Db 270 ICKKVSCPIMPCSNATVPDGECCPRCWPSDSADDGWSPWSEWTSCSTSCGNGIQQRGRSC 329

Qy 270 TNPAPLNGGAFCEGQNVQKTAC-ATLCPVDGSWSPWSKWSACGLDC THWRSRE 321

: II I I I : I I I I I I I I I I I I : I : I I 

Db 330 DS LNNR — CEGSSVQTRTCHIQECDKRFKQDGGWSHWSPWSSCSVTCGDGVITRITL 384 

Qy 322 CSDPAPRNGGEECQGTDLDTRNCTSDLC-VHSASGP 356 

I : I : I : I : I : I : I : I I I : : II 
Db 385 CNSPSPQMNGKPCEGEARETKACKKDACPINGGWGP 420 



RESULT 13 
US-08-313-288B-15 

; Sequence 15, Application US/08313288B 
; Patent No. 5750502 
; GENERAL INFORMATION: 

; APPLICANT: Jessell, Thomas M. and Avihu Klar 

TITLE OF INVENTION: CLONING, EXPRESSION AND USES OF A 
TITLE OF INVENTION: NOVEL SECRETED PROTEIN, F-SPONDIN 
NUMBER OF SEQUENCES: 20 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Cooper & Dunham LLP 
; STREET: 1185 Avenue of the Americas 

CITY: New York 
; STATE: New York 

COUNTRY: USA 

ZIP: 10036 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.30
CURRENT APPLICATION DATA:
; APPLICATION NUMBER: US/08/313,288B

FILING DATE: January 5, 1995
; CLASSIFICATION: 435

ATTORNEY/AGENT INFORMATION: 
NAME: White, John P. 
REGISTRATION NUMBER: 28,678 
REFERENCE/DOCKET NUMBER: 40028-A-PCT-US
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (212) 278-0400 
TELEFAX: (212) 391-0526 
TELEX : 

; INFORMATION FOR SEQ ID NO: 15: 
; SEQUENCE CHARACTERISTICS: 

; LENGTH: 469 amino acids 

TYPE: amino acid 
STRANDEDNESS : single 
TOPOLOGY: linear 
MOLECULE TYPE: peptide 
HYPOTHETICAL: NO 



ANTI-SENSE: NO 
US-08-313-288B-15 



Query Match 5.1%; Score 243; DB 1; Length 469; 

Best Local Similarity 39.5%; Pred. No. 3.6e-14; 

Matches 45; Conservative 14; Mismatches 43; Indels 12; Gaps 4 

Qy 243 GGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACAT--LCPVDGS 300

I I I I I I I I : I : I : I I : I : I I I II III : III : I I I : 
Db 137 GGWSGWGPWEPCSVTCSKGTRTRRRACNHPAPKCGG-HCPGQAQESEACDTQQVCPTHGA 195 

Qy 301 WSPWSKWSACGLDC THWRSRECSDPAP — RNGGEECQGTDLDTRNCT 345 

I : I I : I I I I I : I I I I : I : I I : I I I 

Db 196 WATWGPWTPCSASCHGGPHEPKETRSRKCSAPEPSQKPPGKPCPGLAYEQRRCT 249



RESULT 14 
US-08-985-526-1 

Sequence 1, Application US/08985526 
Patent No. 6080728 
GENERAL INFORMATION: 

APPLICANT: Mixson, James A 

TITLE OF INVENTION: CARRIER: DNA COMPLEXES CONTAINING DNA
TITLE OF INVENTION: ENCODING ANTI-ANGIOGENIC PEPTIDES AND THEIR USE IN GENE
TITLE OF INVENTION: THERAPY
NUMBER OF SEQUENCES: 43 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: Connolly, Bove, Lodge, & Hutz 
STREET: 1220 Market Street, P.O. Box 2207 
CITY: Wilmington 
STATE: Delaware 
COUNTRY: U.S.A. 
ZIP: 19899 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/985,526
FILING DATE: 
CLASSIFICATION: 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US 08/608,845 
FILING DATE: 16-JUL-1996 
ATTORNEY/AGENT INFORMATION: 

NAME: McMorrow Jr., Robert G 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (302) 658-9141 
TELEFAX: (302) 658-5613 
INFORMATION FOR SEQ ID NO: 1: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 218 amino acids 
TYPE: amino acid 
TOPOLOGY: linear 
US-08-985-526-1 



Query Match 5.0%; Score 238; DB 3; Length 218; 

Best Local Similarity 39.3%; Pred. No. 3e-14; 

Matches 48; Conservative 16; Mismatches 44; Indels 14; Gaps 6 



Qy 244 GWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTAC-ATLC PVD 298 

I I I I : I I : I I I I I I I : I I I I : II I I I : I I I I I 

Db 81 GWSPWSEWTSCSTSCGNGIQQRGRSCDS LNNR— CEGSSVQTRTCHIQECDKRFKQD 135 

Qy 299 GSWSPWSKWSACGLDC THWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLC-VHSAS 354 

I I I I I I I : I : I I I : I : I : I : I : : I : I I I : : 

Db 136 GGWSHWSPWSSCSVTCGDGVITRITNLCSPSPQMNGKPCEGREAETKACKKDACPINGGW 195 

Qy 355 GP 356 

I I 

Db 196 GP 197 



RESULT 15 
US-09-540-245A-15 

Sequence 15, Application US/09540245A 
Patent No. 6270984 
GENERAL INFORMATION: 
APPLICANT: Goodman, Corey 
APPLICANT: Kid, Thomas 
APPLICANT: Brose, Katja 
APPLICANT: Tessier-Lavigne, Marc 

TITLE OF INVENTION: Modulating Robo:Ligand Interactions
FILE REFERENCE: B98-031-3

CURRENT APPLICATION NUMBER: US/09/540,245A
CURRENT FILING DATE: 2000-03-31
PRIOR APPLICATION NUMBER: 60/065,544
PRIOR FILING DATE: 1997-11-14
PRIOR APPLICATION NUMBER: 60/081,057
PRIOR FILING DATE: 1998-04-07
NUMBER OF SEQ ID NOS: 20
SOFTWARE: PatentIn Ver. 2.0
SEQ ID NO 15 
LENGTH: 1395 
TYPE: PRT 

ORGANISM: Drosophila melanogaster 
US-09-540-245A-15 

Query Match 4.9%; Score 234.5; DB 3; Length 1395; 

Best Local Similarity 20.7%; Pred. No. 1.5e-12; 

Matches 187; Conservative 104; Mismatches 273; Indels 341; Gaps 43 

Qy 4 RPGLWPALLGIVLAAWLRGSGAQQSATVANPVPGA NPDLLPHFLVEPEDVYIVKN 58

I I I I I : I I I : I : I : I : : I I I : : I I 

Db 28 RMWLLPAWLLLVLVA SNGLPAVRGQYQSPRIIEH PTDLWKKN 70 

Qy 59 KPVLLVCK — AVPATQI FFKCNGEWV RQVDHVIERSTDGSSGLPTMEVRINVSRQQV 113 

: I I I I I I : : I I I : I : : II I 
Db 71 EPATLNCKVEGKPEPTIEWFKDGEPVSTNEKKSHRVQFKDGALFFYRTM -QG 121 

Qy 114 EKVFGLEEYWCQCVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPP 173 

: I III III: I I : I : : I I I I : I II I : : : I I I I 



Db 122 KKEQDGGEYW--CVAKNRVGQAVSRHASLQIAVLRDDFRVEPKDTRVAKGETALLECGPP 179

Qy 174 EGIPPAEVEWLRNEDLVDPSL--DPNVYITREHSLVVRQARLADTANYTCVAKNIV 227

MM : I::: :|| I I I :|:: I II |:|:|:| 

Db 180 KGI PEPTLIWI KDGVPLDDLKAMS FGAS SRVRI VDGGNLLI SNVEPI DEGN YKCI AQNLV 239 

Qy 228 ARRRSASAAVIVYVN GGWSTWTEWSVCSASCGRG WQK- 264

I I : I : I I I | : | : | I : 

Db 240 GTRESSYAKLIVQVKPYFMKEPKDQVMLYGQTATF HCSVGGDPPPKVLWKKE 291

Qy 265 RS RS CTN PAP LNGGAF- C EGQN — VQKTACATL 294 

: I : I I : I : I I I I : I I : I 
Db 292 EGNIPVSRARILHDEKSLEISNITPTDEGTYVCEAHNNVGQISARASLIVHAPPNFTKRP 351 

Qy 295 CPVDGSWSPWSKWSACGL DCTHWRSRECSDPAPRNGGEEC 334 

I I : I I : I : : : I I : I 
Db 352 SNKKVGLNGVVQLPCMASGNPPPSVFWTKEGVSTLMFPNSSHGRQYVAADGT L 404

Qy 335 QGTDLDTRNCTSDLCVHSASGPEDVALYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDV 394 

III: III II I 
Db 405 QITDV RQEDEGYY VCSAF SV 424 

Qy 395 ADS S I LT S GFQ PVS I KP S KADN — PHLLTIQPDLSTTTTTYQGSL CPRQDGPSPKF 448 

III: I : : | | I : : I I I : I I : I |||: 

Db 425 VDSSTVR VFLQVSSVDERPPPIIQIGP ANQTLPKGSVATLPCRATGNPSPRI 476 

Qy 449 Q-LTNGHLLSPLGGGRHTLHHSSPTSEAEEFVSRLSTQNYFRSLPRGTSNMTYGTFNFLG 507 

: : I I : I I : : : I : : I I III:: 
Db 477 KWFHDGHAVQ — AGNRYSIIQGSSLRVDDLQLSDSGTYTCTASGERGETS 524 

Qy 508 GRLMI PNTGI SLLI PPDAI PRGKI YEI YLTLHKPEDVRLPLAGCQTLLS PIVSCGPPGVL 567 

: 11:11 II I I II 

Db 525 WAATLTVEKPGSTSLHRAA DPSTYPAPPGT- 554 

Qy 568 LTRPVI LAMDHCGEPS PDSWSLR-LKKQSCEGS WEDVLH-L 606 

I : I : I I I I I I I I : I I : 

Db 555 PKVLNV S RT S I S LRWAK S QEK P GAVG P 1 1 G YT VE Y F S P D LQT GW I VAAH RV 605 

Qy 607 GEEAPSHLYYCQLEASACYVF TEQLGRFALVGEALSVA- 644 

I : : : I III I : I III 

Db 606 GD TQVTISGLTPGTSYVFLVRAENTQGISVPSGLSNVI KTIEADFDAASANDLSAAR 662 

Qy 645 AAKRLKLLLFAPVACTSLEYNIRVYCLHDTHDALKEWQLEKQLGGQLIQEPRVLHF 701 

I :: I : : : ::: : I I : I I I : I I I : 

Db 663 TLLTGKSVELIDASAINASAVRLE WMLHVSAD EKYVEGLRI HY 705 

Qy 702 KDS YHNLRL SIHDVPSSLWKSKLLVS 727 

II: II::: : I I I : I I : : 
Db 706 KDAS VP S AQ YH S I T VMDAS AE S FWGNLKK YT K YE FFLT P FFET I EGQ P SN SKTALT 762 

Qy 728 YQEIP 732 

I : : : I 

Db 763 YEDVP 767



Search completed: July 12, 2004, 23:01:56
Job time : 28 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model 



Run on: July 12, 2004, 22:36:10 ; Search time 33 Seconds 

(without alignments) 
2617.575 Million cell updates/sec 

Title: US-10-624-932-2
Perfect score: 4791
Sequence: 1 MAVRPGLWPALLGIVLAAWL AVAGLGQPDAGLFTVSEAEC 898

Scoring table: BLOSUM62

Gapop 10.0 , Gapext 0.5 
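
The "sw model" with a gap-open and gap-extend penalty corresponds to a Smith-Waterman local alignment with affine gap costs. The sketch below illustrates that recurrence in its textbook (Gotoh) form. It is not GenCore's implementation. It assumes a gap of length k costs Gapop + k * Gapext, which this report does not confirm, and it uses a hypothetical `toy_score` function (+5 identity, -4 otherwise) as a stand-in for the BLOSUM62 table.

```python
def smith_waterman_affine(a, b, score, gap_open=10.0, gap_extend=0.5):
    """Best local alignment score of a and b under affine gap costs."""
    neg_inf = float("-inf")
    rows, cols = len(a) + 1, len(b) + 1
    # h: best score of a local alignment ending at (i, j)
    # e: same, but ending with a residue of b aligned to a gap
    # f: same, but ending with a residue of a aligned to a gap
    h = [[0.0] * cols for _ in range(rows)]
    e = [[neg_inf] * cols for _ in range(rows)]
    f = [[neg_inf] * cols for _ in range(rows)]
    best = 0.0
    for i in range(1, rows):
        for j in range(1, cols):
            # Extend an existing gap, or open a new one (assumed cost model).
            e[i][j] = max(e[i][j - 1] - gap_extend,
                          h[i][j - 1] - gap_open - gap_extend)
            f[i][j] = max(f[i - 1][j] - gap_extend,
                          h[i - 1][j] - gap_open - gap_extend)
            diag = h[i - 1][j - 1] + score(a[i - 1], b[j - 1])
            # Local alignment: never let a cell drop below zero.
            h[i][j] = max(0.0, diag, e[i][j], f[i][j])
            best = max(best, h[i][j])
    return best


def toy_score(x, y):
    """Stand-in for BLOSUM62: +5 for identity, -4 otherwise (assumed values)."""
    return 5.0 if x == y else -4.0


print(smith_waterman_affine("ACDEFGHIKL", "ACDEFGHIKL", toy_score))   # 50.0
print(smith_waterman_affine("ACDEFGHIKL", "ACDEFWGHIKL", toy_score))  # 39.5
```

In the second call, the single inserted residue is bridged by one gap costing 10.5, so the score is ten identities (50) minus 10.5.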



Searched: 283366 seqs, 96191526 residues

Total number of hits satisfying chosen parameters: 283366

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries

Database : PIR_78:*
pir1:*
pir2:*
pir3:*
pir4:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
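
The throughput figure in this header is consistent with one dynamic-programming cell per pair of query residue and database residue. The short check below assumes that definition of a "cell update" and a query length of 898, taken from the end coordinate of the query sequence line. The function name is hypothetical.

```python
def million_cell_updates_per_sec(query_len: int, db_residues: int,
                                 seconds: float) -> float:
    """Query length times database residues, per second, in millions."""
    return query_len * db_residues / seconds / 1e6


# Figures from this header: 898-residue query, 96191526 residues, 33 seconds.
rate = million_cell_updates_per_sec(898, 96191526, 33)
print(f"{rate:.3f}")  # 2617.575
```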
ALIGNMENTS



RESULT 1 
T32541 

unc-5 protein - Caenorhabditis elegans 
C;Species: Caenorhabditis elegans
C;Date: 29-Oct-1999 #sequence_revision 29-Oct-1999 #text_change 31-Jan-2000
C;Accession: T32541
R;Latreille, P.
submitted to the EMBL Data Library, December 1997
A;Description: The sequence of C. elegans cosmid B0273.
A;Reference number: Z21187
A;Accession: T32541
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-919 <LAT>
A;Cross-references: EMBL:AF036698; PIDN:AAB88355.1; GSPDB:GN00022; CESP:B0273.4a
A;Experimental source: strain Bristol N2; clone B0273
C;Genetics:
A;Gene: unc-5; CESP:B0273.4a
A;Map position: 4
A;Introns: 41/3; 108/1; 142/3; 201/1; 323/2; 553/1; 858/3
C;Superfamily: unc-5 protein; immunoglobulin homology; SH3 homology;
thrombospondin type 1 repeat homology



Query Match 20.4%; Score 977; DB 2; Length 919; 

Best Local Similarity 28.7%; Pred. No. 1.2e-62; 

Matches 265; Conservative 168; Mismatches 379; Indels 110; Gaps 31; 

Qy 49 EPEDVYIVKNKPVLLVCKAVPATQIFFKCNGEWVRQVDHVIER— STDGSSGLPTMEVRI 106 

: I : I : : : I I I : I I : I I I : I : I I : : I : I II: I I : I I : : : : 
Db 9 QPKSGYVIRNKPLRLQCRANHATKIRYKCSSKWID--DSRIEKLIGTDSTSGVGYIDASV 66 

Qy 107 NVSRQQVEKVFGLEEYWCQCVAWSSSG TTKSQKAYIRIARLRKNFEQEPLAKEVS 161

: : I I I : : : : I I I I II I I : : I : I I : I : I : I : I 

Db 67 D I S RI DVDT S GHVDAFQCQC YA SGDDDQD VVASDVATVHLAYMRKHFLKS PVAQRVQ 123 

Qy 162 LEQGIVLPCRPPEGIPPAEVEWLRNEDLVDPSLDPNVYITREHSLVVRQARLADTANYTC 221

: I I I : I I I I I : I : : : I I III : I I : : I I I : I : I I I I 

Db 124 EGTTLQLPCQAPESDPKAELTWYKDGVWQP — DANVI RASDGSLIMSAARLSDSGNYTC 181 

Qy 222 VAKNIVARRRSASAAVIVYVNGGWSTWTEW-SVCSASCG RGWQKR 265

II: I : : I : I I : I I I I I : I I I I : I 

Db 182 EATNVANSRKTDPVEVQIYVDGGWSEWSPWIGTCHVDCPLLRQHAHRIRDPHDVLPHQRR 241 

Qy 266 SRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGSWSPWSKWSACGLDCTHWRSRECSDP 325 

: I : I I I I I I I I : I : I : : I I : I I I I I I I I I I I : I : I I : I 

Db 242 TRTCNNPAPLNDGEYCKGEEEMTRSCKVPCKLDGGWSSWSDWSACSSSCHRYRTRACTVP 301 

Qy 326 APRNGGEECQGTDLDTRNCTSDLCVHSASG— PEDVALYVGLIAVAVCLVLLLLVLILVY 383 

I I I I : I I I I I : I : I I : I | | : | : : : : : I : I : 

Db 302 P PMNGGQ PC FGDDLMTQEC PAQLCTADS S RI VI S DTAVYGSVAS I FI VAS FI LAI LAMFC 361 

Qy 384 CR KKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLLTI 422 

| : | : : : | : | : : | : : : : | : 

Db 362 CKRGNSKKSKPLKPQKMNSEKAGGIYYS EPPGVRRLLLEHQHGTLLGEKISSCSQYF 418 

Qy 423 -QPDLSTTTT TYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSSPT-SE 474 

I I : I I : I II I I I I I I I I : 

Db 419 EPPPLPHSTTLRSGKSAFSGYSSTRNAGSRAALIQECSSSSSGSGGKRTMLRTSSSNCSD 478 

Qy 475 AEEFVSRLSTQNYFRSLPRGTS-NMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYE 533 

: : : I III: : I I I : : I I : : I I : I : 

Db 479 DDNYATLYDYMEDKSVLGLDTSQNIVAAQIDSNGARLSLSKSGARLIVPELAVEGEKM-- 536 

Qy 534 IYLTLHKPEDVRLPLAGCQTLLSPIVSCGPPGV LLTRPVI LAMDHCGEPSP-D 585 

: I I : : I : : I I I : : I I : I I I I : : : II II 

Db 537 LYLAVSDTLTDQPHLKPIESALSPVIVIGQCDVSMSAHDNILRRPVWSFRHCASTFPRD 596 

Qy 586 SWSLRLKKQSCEGS-WEDVLHLGEEAPSHLYYCQLEASA CYVFTEQLGRFAL 636 

: I I : I I I I : : : I I I : : I I I : I I I I I 

Db 597 NWQFTL--YADEGSGWQKAVTIGEENLNTNMFVQFEQPGKKNDGFGWCHVMTYSLARLML 654 

Qy 637 VGEAL — S VAAAKRLKLLLFAPVACTSLE — YNIRVYCLHDTHD7VLKEWQLEKQLGGQL 692 

I I : : I I I I : I : I I : : : : I I I I : : I I : : I : I I : I 

Db 655 AGHPRRNSLSAAKRVHLAVFGPTEMSAYRRPFELRVYCVPETGAAMESVWKQED— GSRL 712 

Qy 693 IQEPR — VLHFKDSYHNLRLSIHDV-PSSLWKSKLLVSYQEIPFYHIWNGTQRYLHCTFT 749 

: I : I : I I I : I I I I : I I : I III: 



Db 713 LCESNDFILNEKG NLCICIEDVT PGFSCDGPEVVEI SETQHRFV AQNGLHCSLK 766

Qy 750 LERVSPSTSDLACKLWVWQVEGDGQSFSINFNITKDTRFAELLALESEAGVPALVGPSAF 809

: I : : : I : I : : : : : : : I I I : I 
Db

Qy 810 KIPFLIRQKIISSLDPPCRRGADWRTLAQKLHLDSHLSFFASKP--SPTAMILNLWEARH 867

: : I I : : : : III : I I I I I : I I I I : I I I I I I I I I : : : I : I I I I 
Db 821 RLPFGVKDELARLLDMPNESHSDWRGLAKKLHYDRYLQFFASFPDCSPTSLLLDLWEASS 880

Qy 868 FPNGN-LSQLAAAVAGLGQPDA 888

Db 881 SGSARAVPDLLQTLRVMGRPDA 902



RESULT 2 
B44294 

unc-5 protein, long form - Caenorhabditis elegans 
N; Contains: unc-5 protein, short form 
C; Species: Caenorhabditis elegans 

C;Date: 30-Apr-1993 #sequence_revision 28-Jul-1995 #text_change 05-Nov-1999 
C;Accession: B44294; T32540; A44294 

R;Leung-Hagesteijn, C.; Spence, A.M.; Stern, B.D.; Zhou, Y.; Su, M.W.;
Hedgecock, E.M.; Culotti, J.G. 
Cell 71, 289-299, 1992 

A; Title: UNC-5, a transmembrane protein with immunoglobulin and thrombospondin 

type 1 domains, guides cell and pioneer axon migrations in C. elegans. 

A;Reference number: A44294; MUID:93046629; PMID:1384987
A;Contents: variety Bergerac
A;Accession: B44294
A;Molecule type: DNA
A;Residues: 1-947 <LEU>
A;Cross-references: GB:S47168; NID:g258527; PIDN:AAB23867.1; PID:g258529
A;Note: sequence extracted from NCBI backbone (NCBIN:116668, NCBIN:116670,
NCBIN:116672, NCBIN:116674, NCBIN:116676, NCBIN:116678, NCBIN:116680,
NCBIN:116682, NCBIN:116685, NCBIP:118648)
A;Note: authors translated the codon CTA for residue 642 as Val; sequence shown
follows the authors' translation

A;Note: mRNA lacking the first exon is equally prevalent 
R;Latreille, P. 

submitted to the EMBL Data Library, December 1997 

A;Description: The sequence of C. elegans cosmid B0273.
A;Reference number: Z21187
A;Accession: T32540
A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-947 <LAT>
A;Cross-references: EMBL:AF036698; PIDN:AAB88356.1; GSPDB:GN00022; CESP:B0273.4b
A;Experimental source: strain Bristol N2; clone B0273
C;Genetics:
A;Gene: unc-5
A;Map position: 4
A;Introns: 28/1; 69/3; 136/1; 170/3; 229/1; 351/2; 581/1; 886/3
C;Function:
A;Description: required for guidance of pioneering axons and cells migrating
dorsally along the body wall; proposed to be a receptor on the surface of the
motile cells
C;Superfamily: unc-5 protein; immunoglobulin homology; SH3 homology;
thrombospondin type 1 repeat homology
C;Keywords: alternative splicing; duplication; glycoprotein; receptor;
transmembrane protein
F;30-947/Product: unc-5 protein, short form #status predicted <ALT>
F;46-116/Domain: immunoglobulin homology <IM1>
F;153-211/Domain: immunoglobulin homology <IM2>
F;229-300/Domain: thrombospondin type 1 repeat homology #status atypical <THR1>
F;301-354/Domain: thrombospondin type 1 repeat homology <THR2>
F;365-390/Domain: transmembrane #status predicted <TMM>
F;512-559/Domain: SH3 homology <SH3>
F;53-114,65-112,160-209/Disulfide bonds: #status predicted
F;206/Binding site: carbohydrate (Asn) (covalent) #status predicted

Query Match 20.4%; Score 977; DB 1; Length 947; 

Best Local Similarity 28.7%; Pred. No. 1.2e-62; 

Matches 265; Conservative 168; Mismatches 379; Indels 110; Gaps 31; 

Qy 49 EPEDVYIVKNKPVLLVCKAVPATQIFFKCNGEWVRQVDHVIER--STDGSSGLPTMEVRI 106

: I : I : : : I I I : I I : I I I : I : I I : : I : I I I : I I : I I : : : : 
Db 37 QPKSGYVIRNKPLRLQCRANHATKIRYKCSSKWID--DSRIEKLIGTDSTSGVGYIDASV 94

Qy 107 NVSRQQVEKVFGLEEYWCQCVAWSSSG TTKSQKAYIRIARLRKNFEQEPLAKEVS 161

: : I I I : : : : I I I I II I I : : I : I I : I : I : I : I 

Db 95 D I S RI DVDT S GHVDAFQCQC YA SGDDDQDWASD VATVHLAYMRKHFLKS PVAQRVQ 151 

Qy 162 LEQGIVLPCRPPEGIPPAEVEWLRNEDLVDPSLDPNVYITREHSLVVRQARLADTANYTC 221

: I I I : I I I I I : I : : : I I III : I I : : I I I : I : I I I I 

Db 152 EGTTLQLPCQAPESDPKAELTWYKDGWVQP — DANVI RASDGSLIMSAARLSDSGNYTC 209 

Qy 222 VAKNIVARRRSASAAVIVYVNGGWSTWTEW-SVCSASCG RGWQKR 265

II: I :: I : I I : I I I I I : I I I I : I 

Db 210 EATNVANSRKTDPVEVQIYVDGGWSEWSPWIGTCHVDCPLLRQHAHRIRDPHDVLPHQRR 269

Qy 266 SRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGSWSPWSKWSACGLDCTHWRSRECSDP 325 

: I : I I I I I I I I : I : I : : I I : I I I I I I I I I I I : I : I I : I 

Db 270 TRTCNNPAPLNDGEYCKGEEEMTRSCKVPCKLDGGWSSWSDWSACSSSCHRYRTRACTVP 329

Qy 326 APRNGGEECQGTDLDTRNCTSDLCVHSASG — PEDVALYVGLIAVAVCLVLLLLVLILVY 383 

I I I I : I I I I I : I : I I : I I hi : : : : : I : I : 

Db 330 P PMNGGQPCFGDDLMTQECPAQLCTADS S RI VI S DTAVYGSVAS I FI VAS FI LAI LAMFC 389 

Qy 384 CR KKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLLTI 422 

I: | : ::|: | : :| :: :: | : 
Db 390 CKRGNSKKSKPLKPQKMNSEKAGGIYYS EPPGVRRLLLEHQHGTLLGEKISSCSQYF 446

Qy 423 -QPDLSTTTT TYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSSPT-SE 474 

I I : I I : I II I II I | I | | : 

Db 447 EPPPLPHSTTLRSGKSAFSGYSSTRNAGSRAALIQECSSSSSGSGGKRTMLRTSSSNCSD 506 

Qy 475 AEEFVSRLSTQNYFRSLPRGTS-NMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYE 533 

: : : I III: : I I I : : I I : : I I : I : 

Db 507 DDNYATLYDYMEDKSVLGLDTSQNIVAAQIDSNGARLSLSKSGARLIVPELAVEGEKM— 564 

Qy 534 IYLTLHKPEDVRLPLAGCQTLLSPIVSCGPPGV LLTRPVILAMDHCGEPSP-D 585 

: I I : : I : : I I I : : I I : | I I I : : : | | | | 

Db 565 LYLAVSDTLTDQPHLKPIESALSPVIVIGQCDVSMSAHDNILRRPVWSFRHCASTFPRD 624 



Qy 586 SWSLRLKKQSCEGS-WEDVLHLGEEAPSHLYYCQLEASA CYVFTEQLGRFAL 636 

: I I : I I I I : : : I I I : : I I I : I I I I I 

Db 625 NWQFTL--YADEGSGWQKAVTIGEENLNTNMFVQFEQPGKKNDGFGWCHVMTYSLARLML 682 

Qy 637 VGEAL--SVAAAKRLKLLLFAPVACTSLE--YNIRVYCLHDTHDALKEVVQLEKQLGGQL 692 

I I : : I I I I : I : I I : : : : I I I I : : I I : : I : I I : I 

Db 683 AGHPRRNSLSAAKRVHLAVFGPTEMSAYRRPFELRWCVPETGAAMESVWKQED--GSRL 740 

Qy 693 IQEPR--VLHFKDSYHNLRLSIHDV-PSSLWKSKLLVSYQEIPFYHIWNGTQRYLHCTFT 749 

: I : I : I I I : I I I I : I I : I III: 
Db 741 LCESNDFILNEKG NLCICIEDVIPGFSCDGPEWEISETQHRFV AQNGLHCSLK 794 

Qy 750 LERVSPSTSDLACKLWVWQVEGDGQSFSINFNITKDTRFAELLALESEAGVPALVGPSAF 809 

: I : : : I : I : : : : : : : I I I : I 
Db 795 FRPKEINGSQFSTRVIVYQKASSTEPMVM--EVSNEPELYDATSEEREKGSVCV EF 848 

Qy 810 KIPFLIRQKIISSLDPPCRRGADWRTLAQKLHLDSHLSFFASKP--SPTAMILNLWEARH 867 

: : I I : : : : III : I I I I I : I I I I : I I I I I I I I I : : : I : I I I I 

Db 849 RLPFGVKDELARLLDMPNESHSDWRGLAKKLHYDRYLQFFASFPDCSPTSLLLDLWEASS 908 

Qy 868 FPNGN-LSQLAAAVAGLGQPDA 888 

: : I : : I : I I I 
Db 909 SGSARAVPDLLQTLRVMGRPDA 930 



RESULT 3 
T00026 

brain-specific angiogenesis inhibitor 1 - human 
N;Alternate names: BAI1 protein 
C;Species: Homo sapiens (man) 

C;Date: 22-Jan-1999 #sequence_revision 22-Jan-1999 #text_change 12-Feb-1999 
C;Accession: T00026 

R;Nishimori, H.; Shiratsuchi, T.; Urano, T.; Kimura, Y.; Kiyono, K.; Tatsumi, 
K.; Yoshida, S.; Ono, M.; Kuwano, M.; Nakamura, Y. 
submitted to the EMBL Data Library, June 1997 
A;Reference number: Z14064 
A;Accession: T00026 

A;Status: translated from GB/EMBL/DDBJ 
A;Molecule type: mRNA 
A;Residues: 1-1584 <NIS> 

A;Cross-references: EMBL:AB005297; NID:d1175078; PID:d1024528 

A;Experimental source: brain 

C;Genetics: 

A;Gene: GDB:BAI1 

A;Cross-references: GDB:9838088; OMIM:602682 
A;Map position: 8q24-8q24 

C;Superfamily: thrombospondin type 1 repeat homology 

F;408-462/Domain: thrombospondin type 1 repeat homology <THR3> 

Query Match 6.2%; Score 298.5; DB 2; Length 1584; 

Best Local Similarity 33.5%; Pred. No. 4.4e-13; 

Matches 78; Conservative 35; Mismatches 91; Indels 29; Gaps 11; 
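
The two percentages reported for each hit appear to follow from simple ratios. The sketch below is an inference from the numbers in this listing, not something taken from the search program's documentation, and the function names `query_match_pct` and `best_local_similarity_pct` are hypothetical. It assumes that Query Match is the hit score divided by the perfect score (4791, as reported in the run header), and that Best Local Similarity is the number of identical matches divided by the number of aligned columns (matches + conservative + mismatches + indels).

```python
# Assumed constant: "Perfect score: 4791" from the run header of this search.
PERFECT_SCORE = 4791.0


def query_match_pct(score, perfect_score=PERFECT_SCORE):
    """Hit score as a percentage of the perfect score (assumed formula)."""
    return 100.0 * score / perfect_score


def best_local_similarity_pct(matches, conservative, mismatches, indels):
    """Identical columns as a percentage of all aligned columns (assumed formula)."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


# RESULT 3 (T00026): Score 298.5; Matches 78; Conservative 35;
# Mismatches 91; Indels 29.
print(f"{query_match_pct(298.5):.1f}")                     # 6.2
print(f"{best_local_similarity_pct(78, 35, 91, 29):.1f}")  # 33.5
```

Under these assumptions the printed values agree with the 6.2% and 33.5% shown for this hit; the same ratios also reproduce the figures listed for RESULT 5 (6.1% and 45.8%).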

Qy 124 CQCVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAEVEW 183 

I I : I I I : : I I : : I I I : I I I I I I I I I 

Db 309 CNREACGPAGRTSSRSQSLRSTDARR REELGDEL QQFGFPA-PQTGDPAAE-EW 360 



Qy 184 LRNEDLVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSASAAVIVYVNG 243 

Db 361 --SPWSVCSSTCGEGWQTR TRFCVSSSYSTQCSGPLREQRLCNNSAVCPVHG 410 

Qy 244 GWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTAC-ATLCPVDG 299 

Db 411 AWDEWSPWSLCSSTCGRGFRDRTRTCR--PPQFGGNPCEGPEKQTKFCNIALCPGRAVDG 468 

Qy 300 SWSPWSKWSACGLDCT HWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLC 349 

: I : I I I I I I I : I : I I I : I : I I I I I I : : I I : I I 

Db 469 NWNEWSSWSACSASCSQGRQQRTRECNGPS--YGGAECQGHWVETRDCFLQQC 519 



RESULT 4 
TSHUP2 

thrombospondin 2 precursor - human 
C;Species: Homo sapiens (man) 

C;Date: 19-May-1995 #sequence_revision 03-Aug-1995 #text_change 13-Aug-1999 
C;Accession: A47379; A42173 
R;LaBell, T.L.; Byers, P.H. 
Genomics 17, 225-229, 1993 

A;Title: Sequence and characterization of the complete human thrombospondin 2 
cDNA: potential regulatory role for the 3' untranslated region. 

A;Reference number: A47379; MUID:94010892; PMID:8406456 

A;Accession: A47379 

A;Molecule type: mRNA 

A;Residues: 1-1172 <LAB> 

A;Cross-references: GB:L12350; NID:g307505; PIDN:AAA03703.1; PID:g307506 
R;LaBell, T.L.; Milewicz, D.J.; Disteche, C.M.; Byers, P.H. 
Genomics 12, 421-429, 1992 

A;Title: Thrombospondin II: partial cDNA sequence, chromosome location, and 
expression of a second member of the thrombospondin gene family in humans. 

A;Reference number: A42173; MUID:92217961; PMID:1559694 

A;Accession: A42173 

A;Molecule type: mRNA 

A;Residues: 560-1172 <LA2> 

A;Cross-references: GB:M81339 

A;Experimental source: fibroblast 

A;Note: sequence extracted from NCBI backbone (NCBIN:95091, NCBIP:95096) 
C;Genetics: 

A;Gene: GDB:THBS2; TSP2 

A;Cross-references: GDB:128789; OMIM:188061 
A;Map position: 6q27-6q27 
C;Complex: homotrimer, disulfide linked 
C;Function: 

A;Description: participates in cell migration and adhesion, and in platelet 
aggregation 

C;Superfamily: thrombospondin 1; EGF homology; thrombospondin type 1 repeat 
homology; von Willebrand factor type C repeat homology 
C;Keywords: beta-hydroxyasparagine; calcium binding; cell adhesion; 
glycoprotein; trimer 

F;1-18/Domain: signal sequence #status predicted <SIG> 

F;19-1172/Product: thrombospondin 2 #status predicted <MAT> 

F;319-377/Domain: von Willebrand factor type C repeat homology <VWC> 

F;380-431/Domain: thrombospondin type 1 repeat homology <THR1> 

F;436-492/Domain: thrombospondin type 1 repeat homology <THR2> 

F;493-549/Domain: thrombospondin type 1 repeat homology <THR3> 
F;553-588/Domain: EGF homology <EGF1> 
F;652-691/Domain: EGF homology <EGF> 
F;928-930/Region: cell attachment (R-G-D) motif 

F;151,316,330,457,584,710,1069/Binding site: carbohydrate (Asn) (covalent) 
#status predicted 

F;167-226/Disulfide bonds: #status predicted 
F;266,270/Disulfide bonds: interchain #status predicted 

F;612/Modified site: erythro-beta-hydroxyasparagine (Asn) #status predicted 

Query Match 6.2%; Score 296.5; DB 1; Length 1172; 

Best Local Similarity 30.5%; Pred. No. 4.1e-13; 

Matches 78; Conservative 28; Mismatches 105; Indels 45; Gaps 9 

Qy 209 RQARLADTANYTCVAKNIVARRRSASAA-VIVYVNGGWSTWTEWSVCSASCGRGWQKRSR 267 

: : I I : I I : : I I II : : I I I I I : I I I I : I I I II 

Db 403 QRGRSCDVTSNTCLGPSIQTRACSLSKCDTRIRQDGGWSHWSPWSSCSVTCGVGNITRIR 462 

Qy 268 SCTNPAPLNGGAFCEGQNVQKTAC-ATLCPVDGSWSPWSKWSACGLDCT HWRSRECS 323 

I : I I II I : I : I I I I : I I I I I I I I I I I : I I : I I : 

Db 463 LCNSPVPQMGGKNCKGSGRETKACQGAPCPIDGRWSPWSPWSACTVTCAGGIRERTRVCN 522 

Qy 324 DPAPRNGGEECQGTDLDTRNCTSDLCVHSASGPEDVALYVGLIAVAVCLVLLLLVLILVY 383 

I I : I I : I I : : I I III II 
Db 523 SPEPQYGGKACVGDVQERQMCNKRSC PVDGCLSNPCFPGAQC 564 

Qy 384 CRKKEGLDSDVADSSILTSGFQPVSI--KPSKADNPHLLTIQPDLSTTTT TYQ 434 

I I I : I I I I : : : : I I : : I : I 

Db 565 SSFPDGS-WSCGFCPVGFLGNGTHCEDLDECALVPDICFSTSKVPRCVNTQP 615 

Qy 435 GSLC PRQDGPSP 446 

II I I I I 

Db 616 GFHCLPCPPRYRGNQP 631 



RESULT 5 
JC5928 

semaphorin F precursor - human 
C;Species: Homo sapiens (man) 

C;Date: 10-Apr-1998 #sequence_revision 08-May-1998 #text_change 17-Nov-2000 
C;Accession: JC5928 

R;Simmons, A.D.; Pueschel, A.W.; McPherson, J.D.; Overhauser, J.; Lovett, M. 
Biochem. Biophys. Res. Commun. 242, 685-691, 1998 

A;Title: Molecular cloning and mapping of human semaphorin F from the 
Cri-du-chat candidate interval. 

A;Reference number: JC5928; MUID:98125554; PMID:9464278 
A;Accession: JC5928 

A;Status: nucleic acid sequence not shown 
A;Molecule type: mRNA 
A;Residues: 1-1074 <SIM> 

A;Cross-references: GB:U52840; NID:g2772583; PIDN:AAC09473.1; PID:g2772584 
A;Experimental source: brain 

C;Comment: This protein disrupts normal brain development and leads to some of 
the features of Cri-du-chat. 
C;Genetics: 
A;Gene: semaf 

C;Superfamily: human semaphorin F; thrombospondin type 1 repeat homology 

F;1-20/Domain: signal sequence #status predicted <SIG> 

F;50-533/Domain: semaphorin #status predicted <SEM> 

F;840-896/Domain: thrombospondin type 1 repeat homology <THR3> 

F;971-993/Domain: transmembrane #status predicted <TMM> 

Query Match 6.1%; Score 293; DB 2; Length 1074; 

Best Local Similarity 45.8%; Pred. No. 6.6e-13; 

Matches 54; Conservative 11; Mismatches 49; Indels 4; Gaps 2 

Qy 241 VNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATL-CPVDG 299 

I I I I I I I I I I I 111:111111 II II::: I I I I I I I I 
Db 783 VNGAWSAWTSWSQCSRDCSRGIRNRKRVCNNPEPKYGGMPCLGPSLEYQECNTLPCPVDG 842 

Qy 300 SWSPWSKWSACGLDC THWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHSAS 354 

II I I I : I I : I : I I I : I I I I I : I I : I : I I I 

Db 843 VWSCWSPWTKCSATCGGGHYMRTRSCSNPAPAYGGDICLGLHTEEALCNTQPCPESWS 900 



RESULT 6 
A42587 

thrombospondin 2 precursor - mouse 
C;Species: Mus musculus (house mouse) 

C;Date: 04-Mar-1993 #sequence_revision 18-Nov-1994 #text_change 20-Aug-1999 
C;Accession: A42587; A39851 

R;Laherty, C.D.; O'Rourke, K.; Wolf, F.W.; Katz, R.; Seldin, M.F.; Dixit, V.M. 
J. Biol. Chem. 267, 3274-3281, 1992 

A;Title: Characterization of mouse thrombospondin 2 sequence and expression 
during cell growth and development. 

A;Reference number: A42587; MUID:92147683; PMID:1371115 
A;Accession: A42587 

A;Status: preliminary; not compared with conceptual translation 
A;Molecule type: nucleic acid 
A;Residues: 1-1172 <LAH> 

A;Cross-references: GB:L07803; GB:M87275; NID:g340421; PIDN:AAA53064.1; 
PID:g567241 

A;Note: sequence extracted from NCBI backbone (NCBIP:81502) 

R;Bornstein, P.; O'Rourke, K.; Wikstrom, K.; Wolf, F.W.; Katz, R.; Li, P.; 
Dixit, V.M. 

J. Biol. Chem. 266, 12821-12824, 1991 

A;Title: A second, expressed thrombospondin gene (Thbs2) exists in the mouse 
genome. 

A;Reference number: A39851; MUID:91302287; PMID:1712771 
A;Accession: A39851 
A;Status: preliminary 
A;Molecule type: mRNA 
A;Residues: 1-873 <BOR> 

A;Cross-references: GB:M64866; NID:g201994; PIDN:AAA40432.1; PID:g201995 
C;Superfamily: thrombospondin 1; EGF homology; thrombospondin type 1 repeat 
homology; von Willebrand factor type C repeat homology 
C;Keywords: calcium binding; glycoprotein 

F;319-377/Domain: von Willebrand factor type C repeat homology <VWC> 
F;380-431/Domain: thrombospondin type 1 repeat homology <THR1> 
F;436-492/Domain: thrombospondin type 1 repeat homology <THR2> 
F;493-549/Domain: thrombospondin type 1 repeat homology <THR3> 
F;553-588/Domain: EGF homology <EGF1> 
F;652-691/Domain: EGF homology <EGF> 



Query Match 6.1%; Score 293; DB 2; Length 1172; 

Best Local Similarity 38.0%; Pred. No. 7.4e-13; 

Matches 60; Conservative 22; Mismatches 66; Indels 10; Gaps 5; 



Qy 209 RQARLADTANYTCVAKNIVARRRS-ASAAVIVYVNGGWSTWTEWSVCSASCGRGWQKRSR 267 
: : I I : I I : : I I I : I I I I I I : I I I I : I I I II 

Db 

Qy 268 SCTNPAPLNGGAFCEGQNVQKTAC-ATLCPVDGSWSPWSKWSACGLDCT HWRSRECS 323 

Db 463 LCNSPVPQMGGKNCKGSGRETKPCQRDPCPIDGRWSPWSPWSACTVTCAGGIRERSRVCN 522 

Qy 324 DPAPRNGGEECQG--TD LDTRNCTSDLCVHSASGP 356 

I I : I |:: I I I : : I : I II:: I 
Db 523 SPEPQYGGKDCVGDVTEHQMCNKRSCPIDGCLSNPCFP 560 



RESULT 7 
T18856 



angiogenesis inhibitor homolog - Caenorhabditis elegans 
C;Species: Caenorhabditis elegans 

C;Date: 15-Oct-1999 #sequence_revision 15-Oct-1999 #text_change 18-Feb-2000 
C;Accession: T18856; T24653 
R;McMurray, A. 

submitted to the EMBL Data Library, July 1995 
A;Reference number: Z19031 
A;Accession: T18856 

A;Status: preliminary; translated from GB/EMBL/DDBJ 

A;Molecule type: DNA 

A;Residues: 1-1444 <WIL> 

A;Cross-references: EMBL:Z50004; PIDN:CAA90293.1; GSPDB:GN00028; CESP:C02B4.1 
A;Experimental source: clone C02B4 
R;McMurray, A. 

submitted to the EMBL Data Library, July 1995 
A;Reference number: Z19917 
A;Accession: T24653 

A;Status: preliminary; translated from GB/EMBL/DDBJ 

A;Molecule type: DNA 

A;Residues: 1-1444 <WI2> 

A;Cross-references: EMBL:Z50006; PIDN:CAA90302.1; GSPDB:GN00028; CESP:C02B4.1 

A;Experimental source: clone T07C5 

C;Genetics: 

A;Gene: CESP:C02B4.1 

A;Map position: X 

A;Introns: 25/3; 70/3; 96/3; 139/3; 187/1; 234/2; 282/3; 376/2; 422/2; 478/3; 
509/3; 566/2; 625/1; 696/2; 786/3; 812/2; 878/3; 971/1; 1007/3; 1067/1; 1099/3; 
1180/3; 1273/2; 1305/1; 1363/1; 1388/2 

Query Match 5.8%; Score 276; DB 2; Length 1444; 

Best Local Similarity 27.2%; Pred. No. 1.7e-11; 

Matches 73; Conservative 28; Mismatches 97; Indels 70; Gaps 12; 

Qy 123 WCQCVAWSSSGTTKSQKAYIRIARLRKNFEQ EPLAKEVSLEQGIVLPCRPPEGI 176 

I : : I I : : : I : I I : I I I : I I I 
Db 1134 WSEWSSWSAC S C FS LT S T RRRFCQ WD PT VQGFCAGAI LEQ IPCAPGSCS 1183 

Qy 177 PPAE--VEWLRNEDLVDPSLDPNVYITREHSLVVRQARLADTAN 218 



II II : I I : I : | | : 

Db 1184 PSAGGWSLWSEWSSCSKDCGDTGHQIRNRMCSEP IPSNRGAYCSG 1228 

Qy 219 YT CVAKNIVARRRSASAAVIVYVNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPA 273 

I : II I : : : I : I I I : I I I I I : I I : I : I I I I 

Db 1229 YSFDQRPCVMDNVCSDEK VDGGWTDWTAWSECTDYCRNGHRSRTRFCANPK 1279 

Qy 274 PLNGGAFCEGQNVQKTAC--ATLCPV-DGSWSPWSKWSACGLDC THWRSRECSDPAP 327 

I I I I I I : : I I : I I II I I I : I I I I I I I I 

Db 1280 PSQGGAQCTGSDFELNPCFDPARCHLRDGGWSTWSDWTPCSASCGFGVQTRDRSCSSPEP 1339 

Qy 328 RNGGEECQGTDLDTRNCTSDLCVHSASG 355 

: I I : I I I I I I : I 

Db 1340 K-GGQSCSGLAHQTSLCDLPACDHESDG 1366 



RESULT 8 
T00326 

hypothetical protein KIAA0550 - human 
C;Species: Homo sapiens (man) 

C;Date: 01-Feb-1999 #sequence_revision 01-Feb-1999 #text_change 21-Jul-2000 
C;Accession: T00326 

R;Nagase, T.; Ishikawa, K.; Miyajima, N.; Tanaka, A.; Kotani, H.; Nomura, N.; 
Ohara, O. 

DNA Res. 5, 31-39, 1998 

A;Title: Prediction of the coding sequences of unidentified human genes. IX. The 
complete sequences of 100 new cDNA clones from brain which can code for large 
proteins in vitro. 

A;Reference number: Z14086; MUID:98290545; PMID:9628581 
A;Accession: T00326 

A;Status: preliminary; translated from GB/EMBL/DDBJ 
A;Molecule type: mRNA 
A;Residues: 1-984 <NAG> 

A;Cross-references: EMBL:AB011122; NID:g3043623; PIDN:BAA25476.1; PID:g3043624 
A;Experimental source: brain 
C;Genetics: 
A;Note: KIAA0550 

C;Superfamily: thrombospondin type 1 repeat homology 

F;344-398/Domain: thrombospondin type 1 repeat homology <THR3> 

Query Match 5.7%; Score 275; DB 2; Length 984; 

Best Local Similarity 39.0%; Pred. No. 1.2e-11; 

Matches 57; Conservative 20; Mismatches 53; Indels 16; Gaps 6; 

Qy 220 TCVA KNIVARRRSASAAVIVYVNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPA 273 

III: : I : : I : I I I : I I : I I : I I I I : I : I I I I 

Db 317 TCVSPYGTHCSGPLRESRVCNNTALCPVHGVWEEWSPWSLCSFTCGRGQRTRTRSCT--P 374 

Qy 274 PLNGGAFCEGQNVQKTAC-ATLCPVDGSWSPWSKWSACGLDC THWRSRECSDPAPRN 329 

I II III I I I II I I I I I I I I : I I I I I : I : I : 

Db 375 PQYGGRPCEGPETHHKPCNIALCPVDGQWQEWSSWSQCSVTCSNGTQQRSRQCT--AAAH 432 



Qy 330 GGEECQGTDLDTRNCTSDLCVHSASG 355 

I I I I : I : : I I : I : I : I 
Db 433 GGSECRGPWAESRECYNPEC--TANG 456 



RESULT 9 
T00028 

brain-specific angiogenesis inhibitor 3 - human 
N;Alternate names: BAI3 protein 
C;Species: Homo sapiens (man) 

C;Date: 22-Jan-1999 #sequence_revision 22-Jan-1999 #text_change 21-Jul-2000 
C;Accession: T00028 

R;Shiratsuchi, T.; Nishimori, H.; Ichise, H.; Nakamura, Y.; Tokino, T. 
Cytogenet. Cell Genet. 79, 103-108, 1997 

A;Title: Cloning and characterization of BAI2 and BAI3, novel genes homologous 
to brain-specific angiogenesis inhibitor 1 (BAI1). 
A;Reference number: Z14066; MUID:98194217; PMID:9533023 
A;Accession: T00028 

A;Status: translated from GB/EMBL/DDBJ 
A;Molecule type: mRNA 
A;Residues: 1-1522 <SHI> 

A;Cross-references: EMBL:AB005299; NID:g3021700; PIDN:BAA25363.1; PID:g3021701 

A;Experimental source: brain 

C;Genetics: 

A;Gene: GDB:BAI3 

A;Cross-references: GDB:9838090; OMIM:602684 
A;Map position: 6q12-6q12 

C;Superfamily: thrombospondin type 1 repeat homology 

F;344-398/Domain: thrombospondin type 1 repeat homology <THR3> 

Query Match 5.7%; Score 275; DB 2; Length 1522; 

Best Local Similarity 39.0%; Pred. No. 2.1e-11; 

Matches 57; Conservative 20; Mismatches 53; Indels 16; Gaps 6 

Qy 220 TCVA KNIVARRRSASAAVIVYVNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPA 273 

III: : I : : hi I I : I I : I I : I I I I : I : I I I I 

Db 317 TCVSPYGTHCSGPLRESRVCNNTALCPVHGVWEEWSPWSLCSFTCGRGQRTRTRSCT--P 374 

Qy 274 PLNGGAFCEGQNVQKTAC-ATLCPVDGSWSPWSKWSACGLDC THWRSRECSDPAPRN 329 

I II III I I I I I I I I I I I I I : I I I I I : I : I : 

Db 375 PQYGGRPCEGPETHHKPCNIALCPVDGQWQEWSSWSQCSVTCSNGTQQRSRQCT--AAAH 432 

Qy 330 GGEECQGTDLDTRNCTSDLCVHSASG 355 

II I 1:1 ::| I : I :|:| 

Db 433 GGSECRGPWAESRECYNPEC--TANG 456 



RESULT 10 
T00027 

brain-specific angiogenesis inhibitor 2 - human 
N;Alternate names: BAI2 protein 
C;Species: Homo sapiens (man) 

C;Date: 22-Jan-1999 #sequence_revision 22-Jan-1999 #text_change 21-Jul-2000 
C;Accession: T00027 

R;Shiratsuchi, T.; Nishimori, H.; Ichise, H.; Nakamura, Y.; Tokino, T. 
Cytogenet. Cell Genet. 79, 103-108, 1997 

A;Title: Cloning and characterization of BAI2 and BAI3, novel genes homologous 
to brain-specific angiogenesis inhibitor 1 (BAI1). 
A;Reference number: Z14066; MUID:98194217; PMID:9533023 
A;Accession: T00027 

A;Status: translated from GB/EMBL/DDBJ 
A;Molecule type: mRNA 

A;Residues: 1-1572 <SHI> 

A;Cross-references: EMBL:AB005298; NID:g3021698; PIDN:BAA25362.1; PID:g3021699 

A;Experimental source: brain 

C;Genetics: 

A;Gene: GDB:BAI2 

A;Cross-references: GDB:9838089; OMIM:602683 
A;Map position: 1p35-1p35 

Query Match 5.7%; Score 274.5; DB 2; Length 1572; 

Best Local Similarity 19.2%; Pred. No. 2.4e-11; 

Matches 176; Conservative 108; Mismatches 307; Indels 327; Gaps 38; 

Qy 173 PEGIPPAEVEWLRNEDLVDPSLDPNVYITREHSLVVRQARL 213 

I I I : : I I : I : I : I : I I I I I 

Db 271 PEEEPKVKTQWPRSAD EPGLYMAQTGDPAAEEWSPWSVCSLTCGQGLQVR-TRS 323 

Qy 214 ADTANYTCVAKNIVARRRSASAAVIVYVNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPA 273 

: : I : : I : : I : I I I I I : I I I I I I I : I I : I 
Db 324 CVSSPYGTLCSGPLRETRPCNNSATCPVHGVWEEWGSWSLCSRSCGRGSRSRMRTCV--P 381 

Qy 274 PLNGGAFCEGQNVQKTACA-TLCPVDGSWSPWSKWSACGLDC THWRSRECSDPAPR- 328 

I : I I 111 : I I : I I I : I I I I I I I I I I : I I I 
Db 382 PQHGGKACEGPELQTKLCSMAACPVEGQWLEWGPWGPCSTSCANGTQQRSRKCSVAGPAW 441 

Qy 329 NGGEECQ 335 

I I : 

Db 442 ATCTGALTDTRECSNLECPATDSKWGPWNAWSLCSKTCDTGWQRRFRMCQATGTQGYPCE 501 

Qy 336 GTDLDTRNCTSDLC--VHSASGPEDVAL 361 

I I : : I : I I III 
Db 502 GTGEEVKPCSEKRCPAFHEMCRDEYVMLMTWKKAAAGEIIYNKCPPNASGSASRRCLLSA 561 

Qy 362 YVGL I AVAVC L VLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKA 414 

I I I : I I : I : : : I : : : I : : I I : : : 

Db 562 QGVAYWGLPSFARCISHEYRYLYLSLREHLAKGQRMLAGEGMSQVAmS-LQELLARRTYY 620 

Qy 415 DNPHLLTIQPDLSTTTTTYQGSLCPRQDGPSPKFQLT NGHLLSPLGG 461 

I : : : | I : : I I I I : : : I I I 

Db 621 SGDLLFSVDILRNVTDTFKRATYVPSADDVQRFFQVVS FMVDAENKEKWDDAQQVSP--G 678 

Qy 462 GRHTLHHSSPTSEAEEFV SRLSTQNYFRSLPRG TSNMTYGTFN 504 

II I : I : I : I I I : I : I : : I : 

Db 679 SVHLLR WEDFI HLVGDALKAFQS S LI VT DNLVI S I QREPVSAVS S DI T FPMRG 732 

Qy 505 FLG GRLMI PNTGISLLIP PDAIPRGK 530 

I I I : I : I I I I : I I 

Db 733 RRGMKDWVRHSEDRLFLPKEVLSLSSPGKPATSGAAGSPGRGRGPGTVPPGPGHSHQRLL 792 

Qy 531 IYE-IYLTLHKPEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVIL 574 

: I : I I I I Ml ::: | : | | | : | 

Db 793 PADPDESSYFVIGAVLYRTLGLILPPP RPPLAVTSRVMT--VTVRPPTQPPAEPLIT 847 

Qy 575 A MDHCGEPSPDSWSLRLKKQSCEGSWEDVLHLGEEAPSHLYYCQ-LEASACYV-- 626 

: : : I I I : : I I : Mill: 
Db 848 VELSYIINGTTDPHCASWDYS-RADASSGDWD TENCQTLETQAAHTRC 894 



Qy 627 FTEQLGRFALVGE ALSVAAAKRLKLLLFAPVACTSLEYNIRVYCLHDTHDALKEV 681 

: I I I : : : I : I : : I : : I : I : I : : I I 
Db 895 QCQHLSTFAVLAQPPKDLTLELAGSPSVPLVIGCAVSCMALLTLLAIYA AFWRF 948 

Qy 682 VQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLWKSKLLVSYQEIPFYHIWNGTQ 741 

: : I : : I I I I : I : : : I I : : I 
Db 949 IKSERSI ILLNFCLSI--LASNI L I LVGQ S RVL S KGVCTMT A 988 

Qy 742 RYLHCTFTLERVSPSTSDLACKLWV WQVEGDG 773 

: I I I I : II I : 

Db 989 AFLHFFF LS S FCWVLTEAWQS YLAVI GRMRTRLVRKRFLCLGWGLPALV 1037 

Qy 774 QSFSINFNITKDTRFAELLALESEAG-VPALVGPSA FKIPFLIRQKI IS 821 

: I : I I I : I I I : I I I I : I I : : I : II 

Db 1038 VAVSVGFTRTKGYGTSSYCWLSLEGGLLYAFVGPAAVIVLVNMLIGIIVFNKLMARDGIS 1097 

Qy 822 SLDPPCRRGAD WRTL 836 

I I : : I : I 
Db 1098 DKS KKQRAGS ERC PWASL 1115 



RESULT 11 
A40558 

thrombospondin 1 precursor - mouse 
C;Species: Mus musculus (house mouse) 

C;Date: 05-Jun-1992 #sequence_revision 05-Jun-1992 #text_change 20-Aug-1999 
C;Accession: A40558; A37905; B42587; S68787 

R;Lawler, J.; Duquette, M.; Ferro, P.; Copeland, N.G.; Gilbert, D.J.; Jenkins, 
N.A. 

Genomics 11, 587-600, 1991 

A;Title: Characterization of the murine thrombospondin gene. 

A;Reference number: A40558; MUID:92128941; PMID:1774063 

A;Accession: A40558 

A;Status: preliminary 

A;Molecule type: DNA 

A;Residues: 1-1170 <LAW> 

A;Cross-references: GB:M62449; GB:M62450; GB:M62451; GB:M62452; GB:M62453; 
GB:M62454; GB:M62455; GB:M62456; GB:M62457; GB:M62458; GB:M62459; GB:M62460; 
GB:M62461; GB:M62462; GB:M62463; GB:M62464; GB:M62465; GB:M62466; GB:M62467; 
GB:M62468; GB:M62469; GB:M62470; NID:g511867; PIDN:AAA50611.1; PID:g511869 
R;Bornstein, P.; Alfi, D.; Devarayalu, S.; Framson, P.; Li, P. 
J. Biol. Chem. 265, 16691-16698, 1990 

A;Title: Characterization of the mouse thrombospondin gene and evaluation of the 
role of the first intron in human gene expression. 

A;Reference number: A37905; MUID:90375546; PMID:2398070 

A;Accession: A37905 

A;Status: preliminary 

A;Molecule type: DNA 

A;Residues: 1-490 <BOR> 

A;Cross-references: GB:J05605; GB:J05606; NID:g201991; PIDN:AAA40431.1; 
PID:g554390 

R;Laherty, C.D.; O'Rourke, K.; Wolf, F.W.; Katz, R.; Seldin, M.F.; Dixit, V.M. 
J. Biol. Chem. 267, 3274-3281, 1992 

A;Title: Characterization of mouse thrombospondin 2 sequence and expression 
during cell growth and development. 

A;Reference number: A42587; MUID:92147683; PMID:1371115 
A;Accession: B42587 

A;Status: preliminary; not compared with conceptual translation 

A;Molecule type: mRNA 

A;Residues: 1-1152,'P',1154-1170 <LAH> 
A;Cross-references: GB:M87276 

A;Note: sequence extracted from NCBI backbone (NCBIP:81501) 
R;Chen, H.; Aeschlimann, D.; Nowlen, J.; Mosher, D.F. 
FEBS Lett. 387, 36-41, 1996 

A;Title: Expression and initial characterization of recombinant mouse 
thrombospondin 1 and thrombospondin 3. 

A;Reference number: S68787; MUID:96234006; PMID:8654563 

A;Accession: S68787 

A;Molecule type: protein 

A;Residues: 19-26,'X',28-37 <CHE> 

C;Complex: homotrimer, disulfide linked 

C;Superfamily: thrombospondin 1; EGF homology; thrombospondin type 1 repeat 
homology; von Willebrand factor type C repeat homology 

C;Keywords: calcium binding; glycoprotein; homotrimer 

F;1-18/Domain: signal sequence #status predicted <SIG> 

F;19-1170/Product: thrombospondin 1 #status predicted <MAT> 

F;317-375/Domain: von Willebrand factor type C repeat homology <VWC> 

F;378-429/Domain: thrombospondin type 1 repeat homology <THR1> 

F;434-490/Domain: thrombospondin type 1 repeat homology <THR2> 

F;491-547/Domain: thrombospondin type 1 repeat homology <THR3> 

F;551-586/Domain: EGF homology <EGF> 

F;248,360,708,1067/Binding site: carbohydrate (Asn) (covalent) #status predicted 

Query Match 5.6%; Score 270.5; DB 2; Length 1170; 

Best Local Similarity 32.2%; Pred. No. 3.2e-ll; 

Matches 57; Conservative 24; Mismatches 71; Indels 25; Gaps 5; 

Qy 207 VVRQARLADTANYTCVAKNIVAR RRSASAAVIVYVNGGWSTWTEWSVCSASC 258 

: : : I I : I I : ■ : I : I : I I I I I : I I I I : I 

Db 399 IQQRGRSCDSLNNRCEGSSVQTRTCHIQECDKRFKQ DGGWSHWSPWSSCSVTC 451 

Qy 259 GRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTAC-ATLCPVDGSWSPWSKWSACGLDC 314 

II I I I : I : I I Ml: : II I I : : I I I I I I I : I 

Db 452 GDGVITRIRLCNSPSPQMNGKPCEGEARETKACKKDACPINGGWGPWSPWDICSVTCGGG 511 

Qy 315 THWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHSASGPEDVALYVGLIAVAVC 371 

I I I I : : I I : I I : : I I : : I I III III 

Db 512 VQRRSRLCNNPTPQFGGKDCVGDVTENQVCNKQDC PIDGCLSNPCFAGAKC 562 



RESULT 12 
TSHUP1 

thrombospondin 1 precursor - human 
C;Species: Homo sapiens (man) 

C;Date: 23-Aug-1987 #sequence_revision 03-Aug-1995 #text_change 17-Nov-2000 

C;Accession: A26155; A34274; A30140; A25812; A05172; A42927 

R;Lawler, J.; Hynes, R.O. 

J. Cell Biol. 103, 1635-1648, 1986 

A;Title: The structure of human thrombospondin, an adhesive glycoprotein with 
multiple calcium-binding sites and homologies with several different proteins. 
A;Reference number: A26155; MUID:87057617; PMID:2430973 
A;Accession: A26155 
A;Molecule type: mRNA 
A;Residues: 1-1170 <LAW> 

A;Cross-references: GB:X04665; NID:g37137; PIDN:CAA28370.1; PID:g37138 



A;Note: parts of this sequence, including the amino end of the mature protein, 
were determined by protein sequencing 
R;Laherty, C.D.; Gierman, T.M.; Dixit, V.M. 
J. Biol. Chem. 264, 11222-11227, 1989 

A;Title: Characterization of the promoter region of the human thrombospondin 
gene. DNA sequences within the first intron increase transcription. 

A;Reference number: A34274; MUID:89291870; PMID:2544587 

A;Accession: A34274 

A;Molecule type: DNA 

A;Residues: 1-166 <LAH> 

A;Cross-references: GB:J04835 

R;Hennessy, S.W.; Frazier, B.A.; Kim, D.D.; Deckwerth, T.L.; Baumgartel, D.M.; 
Rotwein, P.; Frazier, W.A. 

J. Cell Biol. 108, 729-736, 1989 

A;Title: Complete thrombospondin mRNA sequence includes potential regulatory 
sites in the 3' untranslated region. 

A;Reference number: A30140; MUID:89139590; PMID:2918029 
A;Accession: A30140 
A;Molecule type: mRNA 

A;Residues: 1-83,'A',85-522,'A',524-1170 <HEN> 

A;Cross-references: EMBL:X14787; NID:g37464; PIDN:CAA32889.1; PID:g37465 

A;Note: parts of this sequence, including the amino end of the mature protein, 
were determined by protein sequencing 

R;Kobayashi, S.; Eden-McCutchan, F.; Framson, P.; Bornstein, P. 
Biochemistry 25, 8418-8425, 1986 

A;Title: Partial amino acid sequence of human thrombospondin as determined by 
analysis of cDNA clones: homology to malarial circumsporozoite proteins. 

A;Reference number: A25812; MUID:87157592; PMID:3030396 

A;Accession: A25812 

A;Molecule type: mRNA 

A;Residues: 1-83,'A',85-397 <KOB> 

A;Cross-references: GB:M25631; NID:g538353; PIDN:AAA36741.1; PID:g538354 

R;Dixit, V.M.; Hennessy, S.W.; Grant, G.A.; Rotwein, P.; Frazier, W.A. 

Proc. Natl. Acad. Sci. U.S.A. 83, 5449-5453, 1986 

A;Reference number: A05172; MUID:86287276; PMID:3461443 

A;Accession: A05172 

A;Molecule type: mRNA 

A;Residues: 1-83,'A',85-374,'RC' <DIX> 

A;Cross-references: GB:M14326; NID:g340005; PIDN:AAA61237.1; PID:g553801 

A;Note: parts of this sequence, including the amino end of the mature protein, 
were determined by protein sequencing 

R;Sun, X.; Skorstengaard, K.; Mosher, D.F. 

J. Cell Biol. 118, 693-701, 1992 

A;Title: Disulfides modulate RGD-inhibitable cell adhesive activity of 
thrombospondin. 

A;Reference number: A42927; MUID:92348511; PMID:1379247 
A;Accession: A42927 
A;Molecule type: protein 
A;Residues: 987-1003 <SUN> 

A;Note: Cys-992 is shown to have a free sulfhydryl 
C;Genetics: 

A;Gene: GDB:THBS1; TSP1; TSP 

A;Cross-references: GDB:120438; OMIM:188060 

A;Map position: 15q15-15q15 

A;Introns: 23/1 

A;Note: the list of introns may be incomplete 
C;Complex: homotrimer, disulfide linked 



C;Function: 

A;Description: participates in cell migration and adhesion, and in platelet 
aggregation 

C;Superfamily: thrombospondin 1; EGF homology; thrombospondin type 1 repeat 
homology; von Willebrand factor type C repeat homology 
C;Keywords: beta-hydroxyasparagine; calcium binding; cell adhesion; 
glycoprotein; trimer 

F;1-18/Domain: signal sequence #status predicted <SIG> 

F;19-1170/Product: thrombospondin 1 #status predicted <MAT> 

F;317-375/Domain: von Willebrand factor type C repeat homology <VWC> 

F;378-429/Domain: thrombospondin type 1 repeat homology <THR1> 

F;434-490/Domain: thrombospondin type 1 repeat homology <THR2> 

F;491-547/Domain: thrombospondin type 1 repeat homology <THR3> 

F;551-586/Domain: EGF homology <EGF1> 

F;650-689/Domain: EGF homology <EGF2> 

F;926-928/Region: cell attachment (R-G-D) motif 

F;171-232/Disulfide bonds: #status predicted 

F;248,360,708,1067/Binding site: carbohydrate (Asn) (covalent) #status predicted 
F;270,274/Disulfide bonds: interchain #status predicted 

F;610/Modified site: erythro-beta-hydroxyasparagine (Asn) #status predicted 
F;1051/Binding site: carbohydrate (Asn) (covalent) #status absent 

Query Match 5.6%; Score 268.5; DB 1; Length 1170; 

Best Local Similarity 32.9%; Pred. No. 4.4e-ll; 

Matches 51; Conservative 24; Mismatches 61; Indels 19; Gaps 4; 

Qy 207 VVRQARLADTANYTCVAKNIVAR RRSASAAVIVYVNGGWSTWTEWSVCSASC 258 

: : : I I : i I ' ' I : I : I I I I I : I I I I : I 

Db 399 IQQRGRSCDSLNNRCEGSSVQTRTCHIQECDKRFKQ DGGWSHWSPWSSCSVTC 451 

Qy 259 GRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTAC-ATLCPVDGSWSPWSKWSACGLDC 314 

II I I I : I : I I III: : II I I : : I I I I I I I : I 

Db 452 GDGVITRIRLCNSPSPQMNGKPCEGEARETKACKKDACPINGGWGPWSPWDICSVTCGGG 511 

Qy 315 THWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLC 349 

I I I I : : I I : I I : : I I : : I I 
Db 512 VQKRSRLCNNPTPQFGGKDCVGDVTENQICNKQDC 546 



RESULT 13 
A39804 

thrombospondin precursor - chicken 
C;Species: Gallus gallus (chicken) 

C;Date: 10-Sep-1999 #sequence_revision 10-Sep-1999 #text_change 10-Sep-1999 
C;Accession: A39804 

R;Lawler, J.; Duquette, M.; Ferro, P. 
J. Biol. Chem. 266, 8039-8043, 1991 

A;Title: Cloning and sequencing of chicken thrombospondin. 
A;Reference number: A39804; MUID:91217026; PMID:2022631 
A;Accession: A39804 
A;Status: preliminary 
A;Molecule type: mRNA 
A;Residues: 1-1178 <LAW> 

A;Cross-references: GB:M60853; NID:g212763; PIDN:AAA51437.1; PID:g212764 
C;Superfamily: thrombospondin 1; EGF homology; thrombospondin type 1 repeat 
homology; von Willebrand factor type C repeat homology 
F;325-383/Domain: von Willebrand factor type C repeat homology <VWC> 

F;386-437/Domain: thrombospondin type 1 repeat homology <THR1> 

F;442-498/Domain: thrombospondin type 1 repeat homology <THR2> 

F;499-555/Domain: thrombospondin type 1 repeat homology <THR3> 

F;658-697/Domain: EGF homology <EGF> 



Query Match 5.5%; Score 263; DB 1; Length 1178; 

Best Local Similarity 36.2%; Pred. No. 1.1e-10; 

Matches 58; Conservative 16; Mismatches 70; Indels 16; Gaps 

Qy 210 QARLADTANYTCVAKNIVARRRS-ASAAVIVYVNGGWSTWTEWSVCSASCGRGWQKRSRS 268 

: I I I : I I I : : I I I I I : I I I I : I I I II 

Db 410 RGRSCDVTRSACTGPHIQTRMCSFKKCDHRIRQDGGWSHWSPWSSCSVTCGVGNITRIRL 469 

Qy 269 CTNPAPLNGGAFCEGQNVQKTACATL-CPVDGSWSPWSKWSACGLDC THWRSRECSD 324 

I : I I II II : I I I I : I I I I I I I I I : I ' Mil: 

Db 470 CNSPIPQMGGKNCVGNGRETEKCEKAPCPVNGQWGPWSPWSACTVTCGGGIRERSRLCNS 529 

Qy 325 PAPRNGGEECQGTDLDT RNCTSDLCVHSASGP 356 

I I : I I : I I II | : | | | : : | 

Db 530 PEPQYGGKPCVG DTKQHDMCNKRDCPIDGCLSNPCFP 566 



RESULT 14 
S29126 

properdin precursor [validated] - human 
N;Alternate names: factor P 
C;Species: Homo sapiens (man) 

C;Date: 17-Nov-2000 #sequence_revision 17-Nov-2000 #text_change 17-Nov-2000 
C;Accession: S29126; S16150; A05319; T45112; T45113 

R;Nolan, K.F.; Kaluz, S.; Higgins, J.M.G.; Goundis, D.; Reid, K.B.M. 
Biochem. J. 287, 291-297, 1992 

A;Title: Characterization of the human properdin gene. 
A;Reference number: S29126; MUID:93038568; PMID:1417780 
A;Accession: S29126 
A;Molecule type: DNA 

A;Residues: 1-469 <NOL1> 

A;Cross-references: EMBL:X70872; NID:g35679; PIDN:CAA50220.1; PID:g35680 
R;Nolan, K.F.; Schwaeble, W.; Kaluz, S.; Dierich, M.P.; Reid, K.B.M. 
Eur. J. Immunol. 21, 771-776, 1991 

A;Title: Molecular cloning of the cDNA coding for properdin, a positive 
regulator of the alternative pathway of human complement. 
A;Reference number: S16150; MUID:91184288; PMID:2009915 
A;Accession: S16150 
A;Molecule type: mRNA 

A;Residues: 1-456,'R',458-469 <NOL2> 
A;Cross-references: EMBL:X57748 
R;Reid, K.B.M.; Gagnon, J. 
Mol. Immunol. 18, 949-959, 1981 

A;Reference number: A05319; MUID:82195224; PMID:7341961 
A;Accession: A05319 
A;Molecule type: protein 

A;Residues: 28-53,'Q',55-59,'G',61,'I',63;137-138,'P',140-141,'P',143-144,'X', 
146-148,'Y',150,'S',152,'Y',154-156,'XSXGXA';162-163,'E',165-172,'X',174-176, 
'X',178,'V',180;223-228,'X',230-232,'GX',235-238,'GH',241-245;248-251,'X', 
253-257,'P',259,'G',261,'XPP',265-266,'X',268-269;280-285,'X',287-290,'X',292, 
'H',294-300,'SXXX',305-307,'X',309-315,'K',317;333-341,343-357,'X',359-362, 
'EXE';393-404,'QK',407;421-427,'R',429-443,'TKV',447-448,'XX',451,'RX', 
454-455 <REI> 

R;Westberg, J.; Nordin-Fredrikson, G.; Truedsson, L.; Sjoholm, A.G.; Uhlen, M. 
submitted to the EMBL Data Library, May 1997 
A; Reference number: Z22914 
A;Accession: T45112 

A; Status: translated from GB/EMBL/DDBJ 
A; Molecule type: DNA 

A;Residues: 1-54,'X',56-73,'X',75-99,'W',101-469 <WES1>
A;Cross-references: EMBL:AF005665; PIDN:AAB63280.1

A; Experimental source: genomic DNA from individual with properdin deficiency 
type II 

A; Accession: T45113 

A; Status: translated from GB/EMBL/DDBJ 
A; Molecule type: DNA 

A;Residues: 1-60,'X',62-413,'D',415-452,'XX',455-469 <WE2>
A;Cross-references: EMBL:AF005666; PIDN:AAC51626.1

A; Experimental source: genomic DNA from individual with properdin deficiency 
type III 

R;Hartmann, S.; Hofsteenge, J. 

J. Biol. Chem. 275, 28569-28574, 2000 

A;Title: Properdin, the positive regulator of complement, is highly C- 
mannosylated. 

A; Reference number: A59360; MUID: 20435812; PMID: 10878002 
A; Contents: annotation 

A; Note: identification and location of C-mannosylation sites by mass- 

spectroscopy 

C; Genetics: 

A;Gene: GDB:PFC

A;Cross-references: GDB:120275; OMIM:312060
A;Map position: Xp11.3-Xp11.23

A;Introns: 26/1; 76/2; 135/1; 192/1; 256/1; 314/1; 378/1; 415/2 
C;Complex: a mixture of homodimers, homotrimers and homotetramers
C; Function : 

A; Description: protects C3 convertase (C3bBb) from rapid inactivation 
A; Pathway: complement alternate pathway 

C;Superfamily: human properdin precursor; thrombospondin type 1 repeat homology
C;Keywords: complement alternate pathway; glycoprotein; homodimer; homotetramer;
homotrimer; plasma

F;1-27/Domain: signal sequence #status predicted <SIG>

F;28-469/Product: properdin #status experimental <MAT>

F;76-128/Domain: thrombospondin type 1 repeat homology <THR1>

F;135-191/Domain: thrombospondin type 1 repeat homology <THR2>

F;192-255/Domain: thrombospondin type 1 repeat homology <THR3>

F;256-313/Domain: thrombospondin type 1 repeat homology <THR4>

F;314-377/Domain: thrombospondin type 1 repeat homology <THR5>

F;378-440/Domain: thrombospondin type 1 repeat homology <THR6>

F;83,86,139,142,145,196,199,260,263,321,324,382,385,388/Modified site: 2'-
mannosyl-tryptophan (Trp) #status experimental

F;428/Binding site: carbohydrate (Asn) (covalent) #status predicted 

Query Match 5.1%; Score 243; DB 1; Length 469; 

Best Local Similarity 39.5%; Pred. No. 9.2e-10; 

Matches 45; Conservative 14; Mismatches 43; Indels 12; Gaps 4; 

Qy 243 GGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACAT--LCPVDGS 300

I I I I I I I I : I : I : I I : I : I I I | | III : III : I I I : 



Db 137 GGWSGWGPWEPCSVTCSKGTRTRRRACNHPAPKCGG-HCPGQAQESEACDTQQVCPTHGA 195

Qy 301 WSPWSKWSACGLDC-------THWRSRECSDPAP--RNGGEECQGTDLDTRNCT 345

I : I I : I I I I I : I I I I : I : I I : I I I

Db 196 WATWGPWTPCSASCHGGPHEPKETRSRKCSAPEPSQKPPGKPCPGLAYEQRRCT 249



RESULT 15 
S05478 

properdin - mouse (fragment) 

C; Species: Mus musculus (house mouse) 

C;Date: 07-Sep-1990 #sequence_revision 07-Sep-1990 #text_change 17-Nov-2000 
C;Accession: S05478 
R;Goundis, D.; Reid, K.B.M. 
Nature 335, 82-85, 1988 

A; Title: Properdin, the terminal complement components, thrombospondin and the 
circumsporozoite protein of malaria parasites contain similar sequence motifs. 
A;Reference number: S05478; MUID:88318954; PMID:3045564
A;Accession: S05478
A;Molecule type: mRNA
A;Residues: 1-437 <GOU>

A;Cross-references: EMBL:X12905; NID:g53786; PIDN:CAA31389.1; PID:g53787
C; Complex: a mixture of homodimers, homotrimers and homotetramers 
C; Function: 

A; Description: protects C3 convertase (C3bBb) from rapid inactivation 
A; Pathway: complement alternate pathway 

C;Superfamily: human properdin precursor; thrombospondin type 1 repeat homology
C;Keywords: complement alternate pathway; glycoprotein; homodimer; homotetramer;
homotrimer; plasma

F;45-97/Domain: thrombospondin type 1 repeat homology <THR1>
F;104-160/Domain: thrombospondin type 1 repeat homology <THR2>
F;161-224/Domain: thrombospondin type 1 repeat homology <THR3>
F;225-282/Domain: thrombospondin type 1 repeat homology <THR4>
F;283-345/Domain: thrombospondin type 1 repeat homology <THR5>
F;346-408/Domain: thrombospondin type 1 repeat homology <THR6>
F;52,55,108,111,114,165,168,229,232,290,293,350,353,356/Modified site: 2'-
mannosyl-tryptophan (Trp) #status predicted

F;366, 396/Binding site: carbohydrate (Asn) (covalent) #status predicted 

Query Match 4.8%; Score 229; DB 2; Length 437; 

Best Local Similarity 40.4%; Pred. No. 8.7e-09; 

Matches 46; Conservative 10; Mismatches 46; Indels 12; Gaps 4; 

Qy 243 GGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACAT--LCPVDGS 300

I I I I I I I I : I : I I I I I I I I I II II: I III M I : 
Db 106 GGWSEWGPWGPCSVTCSKGTQIRQRVCDNPAPKCGG-HCPGEAQQSQACDTQKTCPTHGA 164 

Qy 301 WSPWSKWSACGLDC-------THWRSRECSDPAPRN--GGEECQGTDLDTRNCT 345

hill I I I I I I I I I : I : I I : : I : 

Db 165 WASWGPWSPRSGSCLGGAQEPKETRSRSCSAPAPSHQPPGKPCSGPAYEHKACS 218 



Search completed: July 12, 2004, 23:01:22 
Job time : 35 secs
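
The percentages printed with each result can be cross-checked against the raw counts. The sketch below is an inference from the numbers in this listing, not documented GenCore behavior: it assumes "Query Match" is the result score divided by the perfect score, and "Best Local Similarity" is Matches divided by the number of alignment columns (Matches + Conservative + Mismatches + Indels). The function names are hypothetical.

```python
import re


def parse_stats(block):
    """Collect 'Name value' fields from a semicolon-separated statistics block."""
    stats = {}
    for field in block.replace("\n", " ").split(";"):
        m = re.match(r"\s*(.+?)\s+([0-9][0-9.eE+-]*)%?\s*$", field)
        if m:
            stats[m.group(1)] = float(m.group(2))
    return stats


def query_match(score, perfect_score):
    """Assumed definition: score as a percentage of the perfect (self) score."""
    return 100.0 * score / perfect_score


def best_local_similarity(stats):
    """Assumed definition: identical columns as a percentage of all columns."""
    columns = (stats["Matches"] + stats["Conservative"]
               + stats["Mismatches"] + stats["Indels"])
    return 100.0 * stats["Matches"] / columns


# Statistics block of RESULT 14 (S29126), copied from this listing.
RESULT_14 = """Query Match 5.1%; Score 243; DB 1; Length 469;
Best Local Similarity 39.5%; Pred. No. 9.2e-10;
Matches 45; Conservative 14; Mismatches 43; Indels 12; Gaps 4;"""

PERFECT_SCORE = 4791  # "Perfect score" from the search header

stats = parse_stats(RESULT_14)
print(round(query_match(stats["Score"], PERFECT_SCORE), 1))
print(round(best_local_similarity(stats), 1))
```

Under these assumptions the recomputed values agree with the printed 5.1% and 39.5% for RESULT 14. The same check can be run on any other statistics block in the listing.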



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model

Run on: July 12, 2004, 23:00:51 ; Search time 97 Seconds
(without alignments)
2887.655 Million cell updates/sec

Title: US-10-624-932-2
Perfect score: 4791
Sequence: 1 MAVRPGLWPALLGIVLAAWL AVAGLGQPDAGLFTVSEAEC 898

Scoring table: BLOSUM62
Gapop 10.0 , Gapext 0.5

Searched: 1279676 seqs, 311918243 residues

Total number of hits satisfying chosen parameters: 1279676

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries



Database : Published_Applications_AA: *

1: /cgn2_6/ptodata/2/pubpaa/US07_PUBCOMB.pep: *
2: /cgn2_6/ptodata/2/pubpaa/PCT_NEW_PUB.pep: *
3: /cgn2_6/ptodata/2/pubpaa/US06_NEW_PUB.pep: *
4: /cgn2_6/ptodata/2/pubpaa/US06_PUBCOMB.pep: *
5: /cgn2_6/ptodata/2/pubpaa/US07_NEW_PUB.pep: *
6: /cgn2_6/ptodata/2/pubpaa/PCTUS_PUBCOMB.pep: *
7: /cgn2_6/ptodata/2/pubpaa/US08_NEW_PUB.pep: *
8: /cgn2_6/ptodata/2/pubpaa/US08_PUBCOMB.pep: *
9: /cgn2_6/ptodata/2/pubpaa/US09A_PUBCOMB.pep: *
10: /cgn2_6/ptodata/2/pubpaa/US09B_PUBCOMB.pep: *
11: /cgn2_6/ptodata/2/pubpaa/US09C_PUBCOMB.pep: *
12: /cgn2_6/ptodata/2/pubpaa/US09_NEW_PUB.pep: *
13: /cgn2_6/ptodata/2/pubpaa/US10A_PUBCOMB.pep: *
14: /cgn2_6/ptodata/2/pubpaa/US10B_PUBCOMB.pep: *
15: /cgn2_6/ptodata/2/pubpaa/US10C_PUBCOMB.pep: *
16: /cgn2_6/ptodata/2/pubpaa/US10_NEW_PUB.pep: *
17: /cgn2_6/ptodata/2/pubpaa/US60_NEW_PUB.pep: *
18: /cgn2_6/ptodata/2/pubpaa/US60_PUBCOMB.pep: *



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
ALIGNMENTS



RESULT 1 
US-09-918-779-2 

; Sequence 2, Application US/09918779 
; Publication No. US20030064369A1 
; GENERAL INFORMATION: 
; APPLICANT: Taupier, Raymond 



APPLICANT: Padigaru, Muralidhara 
APPLICANT: Rastelli, Luca 
APPLICANT: Spaderna, Steven 
APPLICANT: Shimkets, Richard 
APPLICANT: Zerhusen, Bryan 
APPLICANT: Spytek, Kimberly 
APPLICANT: Shenoy, Suresh 
APPLICANT: Li, Li 
APPLICANT: Gusev, Vladimir 
APPLICANT: Grosse, William 
APPLICANT: Alsobrook, John 
APPLICANT: Lepley, Denise 
APPLICANT: Burgess, Catherine 
APPLICANT: Gerlach, Valerie 
APPLICANT: Ellerman, Karen 
APPLICANT: MacDougall, John 
APPLICANT: Stone, David 
APPLICANT: Smithson, Glennda 

TITLE OF INVENTION: Novel Proteins and Nucleic Acids Encoding Same 
FILE REFERENCE: 21402-074 US 
CURRENT APPLICATION NUMBER: US/09/918,779 
CURRENT FILING DATE: 2001-07-30 
PRIOR APPLICATION NUMBER: 60/221,409 
PRIOR FILING DATE: 2000-07-28 
PRIOR APPLICATION NUMBER: 60/222,840 
PRIOR FILING DATE: 2000-08-04 
PRIOR APPLICATION NUMBER: 60/223,752 
PRIOR FILING DATE: 2000-08-08 
PRIOR APPLICATION NUMBER: 60/223,762 
PRIOR FILING DATE: 2000-08-08 
PRIOR APPLICATION NUMBER: 60/223,770 
PRIOR FILING DATE: 2000-08-08 
PRIOR APPLICATION NUMBER: 60/223,769 
PRIOR FILING DATE: 2000-08-08 
PRIOR APPLICATION NUMBER: 60/225,146 
PRIOR FILING DATE: 2000-08-14 
PRIOR APPLICATION NUMBER: 60/225,392 
PRIOR FILING DATE: 2000-08-15 
PRIOR APPLICATION NUMBER: 60/225,470 
PRIOR FILING DATE: 2000-08-15 
PRIOR APPLICATION NUMBER: 60/225,697 
PRIOR FILING DATE: 2000-08-16 
PRIOR APPLICATION NUMBER: 60/263,662 
PRIOR FILING DATE: 2001-02-01 
PRIOR APPLICATION NUMBER: 60/281,645 
PRIOR FILING DATE: 2001-04-05 
NUMBER OF SEQ ID NOS: 61
SOFTWARE: PatentIn Ver. 2.1
SEQ ID NO 2
LENGTH: 898
TYPE: PRT 

ORGANISM: Homo sapiens 
US-09-918-779-2 



Query Match 100.0%; Score 4791; DB 12; Length 898; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 898; Conservative 0; Mismatches 0; Indels 0; Gaps 0;



Qy 1 MAVRPGLWPALLGIVLAAWLRGSGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKP 60
     ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1 MAVRPGLWPALLGIVLAAWLRGSGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKP 60

Qy 61 VLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLE 120
      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 61 VLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLE 120

Qy 121 EYWCQCVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAE 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 EYWCQCVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAE 180

Qy 181 VEWLRNEDLVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSASAAVIVY 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 VEWLRNEDLVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSASAAVIVY 240

Qy 241 VNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGS 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 VNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGS 300

Qy 301 WSPWSKWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHSASGPEDVA 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 WSPWSKWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHSASGPEDVA 360

Qy 361 LYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLL 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 LYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLL 420

Qy 421 TIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSSPTSEAEEFVS 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 TIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSSPTSEAEEFVS 480

Qy 481 RLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHK 540
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 481 RLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHK 540

Qy 541 PEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSW 600
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 541 PEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSW 600

Qy 601 EDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVACT 660
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 601 EDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVACT 660

Qy 661 SLEYNIRVYCLHDTHDALKEVVQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLW 720
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 661 SLEYNIRVYCLHDTHDALKEVVQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLW 720

Qy 721 KSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFSINF 780
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 721 KSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFSINF 780

Qy 781 NITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLAQKL 840
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 781 NITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLAQKL 840

Qy 841 HLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAEC 898
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 841 HLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAEC 898



RESULT 2 
US-10-624-932-2 

Sequence 2, Application US/10624932 
Publication No. US20040096877A1 
GENERAL INFORMATION: 
APPLICANT: Taupier, Raymond 
APPLICANT: Padigaru, Muralidhara 
APPLICANT: Rastelli, Luca 
APPLICANT: Spaderna, Steven 
APPLICANT: Shimkets, Richard 
APPLICANT: Zerhusen, Bryan 
APPLICANT: Spytek, Kimberly 
APPLICANT: Shenoy, Suresh 
APPLICANT: Li, Li 
APPLICANT: Gusev, Vladimir 
APPLICANT: Grosse, William 
APPLICANT: Alsobrook, John 
APPLICANT: Lepley, Denise 
APPLICANT: Burgess, Catherine 
APPLICANT: Gerlach, Valerie 
APPLICANT: Ellerman, Karen 
APPLICANT: MacDougall, John 
APPLICANT: Stone, David 
APPLICANT: Smithson, Glennda 

TITLE OF INVENTION: Novel Proteins and Nucleic Acids Encoding Same 
FILE REFERENCE: 21402-074 US 
CURRENT APPLICATION NUMBER: US/10/624,932
CURRENT FILING DATE: 2003-07-21 
PRIOR APPLICATION NUMBER: 09/918,779 
PRIOR FILING DATE: 2001-07-03 
PRIOR APPLICATION NUMBER: 60/221,409
PRIOR FILING DATE: 2000-07-28 
PRIOR APPLICATION NUMBER: 60/222,840 
PRIOR FILING DATE: 2000-08-04 
PRIOR APPLICATION NUMBER: 60/223,752 
PRIOR FILING DATE: 2000-08-08 
PRIOR APPLICATION NUMBER: 60/223,762 
PRIOR FILING DATE: 2000-08-08 
PRIOR APPLICATION NUMBER: 60/223,770 
PRIOR FILING DATE: 2000-08-08 
PRIOR APPLICATION NUMBER: 60/223,769 
PRIOR FILING DATE: 2000-08-08 
PRIOR APPLICATION NUMBER: 60/225,146 
PRIOR FILING DATE: 2000-08-14 
PRIOR APPLICATION NUMBER: 60/225,392 
PRIOR FILING DATE: 2000-08-15 
PRIOR APPLICATION NUMBER: 60/225,470 
PRIOR FILING DATE: 2000-08-15 

Remaining Prior Application data removed - See File Wrapper or PALM. 
NUMBER OF SEQ ID NOS: 61
SOFTWARE: PatentIn Ver. 2.1
SEQ ID NO 2
LENGTH: 898
; TYPE: PRT 

; ORGANISM: Homo sapiens 
US-10-624-932-2 

Query Match 100.0%; Score 4791; DB 16; Length 898; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 898; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy 1 MAVRPGLWPALLGIVLAAWLRGSGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKP 60
     ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1 MAVRPGLWPALLGIVLAAWLRGSGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKP 60

Qy 61 VLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLE 120
      ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 61 VLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLE 120

Qy 121 EYWCQCVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAE 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 EYWCQCVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAE 180

Qy 181 VEWLRNEDLVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSASAAVIVY 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 VEWLRNEDLVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSASAAVIVY 240

Qy 241 VNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGS 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 VNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGS 300

Qy 301 WSPWSKWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHSASGPEDVA 360
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 WSPWSKWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHSASGPEDVA 360

Qy 361 LYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLL 420
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 LYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLL 420

Qy 421 TIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSSPTSEAEEFVS 480
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 TIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSSPTSEAEEFVS 480

Qy 481 RLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHK 540
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 481 RLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHK 540

Qy 541 PEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSW 600
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 541 PEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSW 600

Qy 601 EDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVACT 660
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 601 EDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVACT 660

Qy 661 SLEYNIRVYCLHDTHDALKEVVQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLW 720
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 661 SLEYNIRVYCLHDTHDALKEVVQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLW 720

Qy 721 KSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFSINF 780
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 721 KSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFSINF 780

Qy 781 NITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLAQKL 840
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 781 NITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLAQKL 840

Qy 841 HLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAEC 898
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 841 HLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAEC 898



RESULT 3 
US-09-970-944-2 

; Sequence 2, Application US/09970944 
; Publication No. US20030204052A1 
; GENERAL INFORMATION: 
; APPLICANT: Herrman, John L 
APPLICANT: Rastelli, Luca 
; APPLICANT: Shimkets, Richard A 

; TITLE OF INVENTION: Novel Proteins and Nucleic Acids Encoding Same and
; TITLE OF INVENTION: Antibodies Directed Against these Proteins
; FILE REFERENCE: 21402-138 

; CURRENT APPLICATION NUMBER: US/09/970,944
; CURRENT FILING DATE: 2002-05-02 

PRIOR APPLICATION NUMBER: 60/237,862 
; PRIOR FILING DATE: 2000-10-04 
; NUMBER OF SEQ ID NOS: 62
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 2 

LENGTH: 899 
; TYPE: PRT 

; ORGANISM: Homo sapiens 
US-09-970-944-2 

Query Match 98.1%; Score 4698.5; DB 11; Length 899; 

Best Local Similarity 98.7%; Pred. No. 0; 

Matches 888; Conservative 2; Mismatches 7; Indels 3; Gaps 3; 



Qy 1 MAVRPGLWPALLGIVLAAWLRGSGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKP 60
     ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1 MAVRPGLWPALLGIVLAAWLRGSGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKP 60

Qy 61 VLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLE 120
      |||||||||||||||||||||||||||||||||||||| |||||||||||||||||||||
Db 61 VLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGEPTMEVRINVSRQQVEKVFGLE 120

Qy 121 EYWCQCVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAE 180
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 121 EYWCQCVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAE 180

Qy 181 VEWLRNEDLVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSASAAVIVY 240
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 181 VEWLRNEDLVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSASAAVIVY 240

Qy 241 VNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNV-QKTACATLCPVDG 299
       ||||||||||||||||||||||||||||||||||||||||||||||  :|  : |  |||
Db 241 VNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVHDRTVSSLLVSVDG 300

Qy 300 SWSPWSKWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHSASGPEDV 359
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 301 SWSPWSKWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHSASGPEDV 360

Qy 360 ALYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHL 419
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 361 ALYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHL 420

Qy 420 LTIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSSPTSEAEEFV 479
       |||||||| |||||||||||||||||||||||||||||||||||||||||||||||||||
Db 421 LTIQPDLS-TTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSSPTSEAEEFV 479

Qy 480 SRLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLH 539
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 480 SRLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLH 539

Qy 540 KPEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGS 599
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 540 KPEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGS 599

Qy 600 WE-DVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVA 658
       || |||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 600 WEQDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVA 659

Qy 659 CTSLEYNIRVYCLHDTHDALKEVVQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSS 718
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 660 CTSLEYNIRVYCLHDTHDALKEVVQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSS 719

Qy 719 LWKSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFSI 778
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 720 LWKSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFSI 779

Qy 779 NFNITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLAQ 838
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 780 NFNITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLAQ 839

Qy 839 KLHLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAEC 898
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 840 KLHLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAEC 899



RESULT 4 
US-09-933-261-5 

; Sequence 5, Application US/09933261 
; Publication No. US20030040046A1 
GENERAL INFORMATION: 

APPLICANT: Tessier-Lavigne, Marc 
; Leonardo, E. David 

; Hink, Lindsay 

; Masu, Masayuki 

; Kazuko, Keino-Masu 

TITLE OF INVENTION: Netrin Receptors 

NUMBER OF SEQUENCES: 8 



CORRESPONDENCE ADDRESS: 

ADDRESSEE: SCIENCE & TECHNOLOGY LAW GROUP 
STREET: 268 BUSH STREET, SUITE 3200 
CITY: SAN FRANCISCO 
STATE: CALIFORNIA 
; COUNTRY: USA 

ZIP: 94104 
; COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 
SOFTWARE: PatentIn Release #1.0, Version #1.30
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/09/933,261 
; FILING DATE: 20-Aug-2001 

; CLASSIFICATION: <Unknown> 

PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/808,982 
FILING DATE: <Unknown> 
ATTORNEY/AGENT INFORMATION: 
NAME: OSMAN, RICHARD A 
REGISTRATION NUMBER: 36,627 
REFERENCE/DOCKET NUMBER: UC96-217
TELECOMMUNICATION INFORMATION: 
; TELEPHONE: (415) 343-4341 

; TELEFAX: (415) 343-4342 

INFORMATION FOR SEQ ID NO: 5: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 898 amino acids 
; TYPE: amino acid 

STRANDEDNESS: Not Relevant
TOPOLOGY: Not Relevant
MOLECULE TYPE: peptide 
; SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

US-09-933-261-5 

Query Match 96.8%; Score 4638; DB 10; Length 898; 

Best Local Similarity 96.0%; Pred. No. 0; 

Matches 862; Conservative 17; Mismatches 19; Indels 0; Gaps 0;



Qy 1 MAVRPGLWPALLGIVLAAWLRGSGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKP 60

I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 1 MAVRPGLWPVLLGIVLAAWLRGSGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKP 60 

Qy 61 VLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLE 120 

I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I 

Db 61 VLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDSSSGLPTMEVRINVSRQQVEKVFGLE 120

Qy 121 EYWCQCVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAE 180 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I ! I I I I I I I I II I I I I I I I I | | | | I I I I I 

Db 121 EYWCQCVAWSSSGTTKSQKAYIRIAYLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAE 180 

Qy 181 VEWLRNEDLVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSASAAVIVY 240

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 181 VEWLRNEDLVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSTSAAVIVY 240

Qy 241 VNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGS 300 



1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

Db 241 VNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGS 300 

Qy 301 WSPWSKWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHSASGPEDVA 360 

II I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I : I I I I I I I I I I I I I : I : I I I I I II 
Db 301 WSSWSKWSACGLDCTHWRSRECSDPAPRNGGEECRGADLDTRNCTSDLCLHTASCPEDVA 360

Qy 361 LYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLL 420 

11:11:111111 I I I I I I : I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I 

Db 361 LYIGLVAVAVCLFLLLLALGLIYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLL 420

Qy 421 TIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSSPTSEAEEFVS 480 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I : I I I 
Db 421 TIQPDLSTTTTTYQGSLCSRQDGPSPKFQLSNGHLLSPLGSGRHTLHHSSPTSEAEDFVS 480 

Qy 481 RLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHK 540 

I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I II I I I I I I II I I I I I I I I I I I I 

Db 481 RLSTQNYFRSLPRGTSNMAYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHK 540

Qy 541 PEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSW 600

I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I II I I I I I II I I I I I I I I I I I I I II I I I I 
Db 541 PEDVRLPLAGCQTLLSPVVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSW 600

Qy 601 EDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVACT 660 

I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I 
Db 601 EDVLHLGEESPSHLYYCQLEAGACYVFTEQLGRFALVGEALSVAATKRLRLLLFAPVACT 660 

Qy 661 SLEYNIRVYCLHDTHDALKEVVQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLW 720

I I I I I I I I I I I I I M I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 661 SLEYNIRVYCLHDTHDALKEVVQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLW 720

Qy 721 KSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFSINF 780 

I I I I II I I I I I I I I I I I I I I I : I I I I I I I I I I : : I I I I I I I I : I I I I I I I I I I I I : I I I 
Db 721 KSKLLVSYQEIPFYHIWNGTQQYLHCTFTLERINASTSDLACKVWVWQVEGDGQSFNINF 780

Qy 781 NITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLAQKL 840 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I : I I I I I I I I I I I I I I I I I I 
Db 781 NITKDTRFAELLALESEGGVPALVGPSAFKIPFLIRQKIIASLDPPCSRGADWRTLAQKL 840 

Qy 841 HLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAEC 898 

I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I 
Db 841 HLDSHLSFFASKPSPTAMILNLWEARHFPNGNLGQLAAAVAGLGQPDAGLFTVSEAEC 898

RESULT 5 

US-09-970-944-13 

Sequence 13, Application US/09970944 
Publication No. US20030204052A1 
GENERAL INFORMATION: 
APPLICANT: Herrman, John L 
APPLICANT: Rastelli, Luca 
APPLICANT: Shimkets, Richard A 

TITLE OF INVENTION: Novel Proteins and Nucleic Acids Encoding Same and
TITLE OF INVENTION: Antibodies Directed Against these Proteins
FILE REFERENCE: 21402-138 

CURRENT APPLICATION NUMBER: US/09/970,944
CURRENT FILING DATE: 2002-05-02
PRIOR APPLICATION NUMBER: 60/237,862 
PRIOR FILING DATE: 2000-10-04 
NUMBER OF SEQ ID NOS: 62
SOFTWARE: PatentIn Ver. 2.1
SEQ ID NO 13 
LENGTH: 898 
TYPE: PRT 

ORGANISM: Rattus norvegicus 
US-09-970-944-13 

Query Match 96.8%; Score 4638; DB 11; Length 898; 

Best Local Similarity 96.0%; Pred. No. 0; 

Matches 862; Conservative 17; Mismatches 19; Indels 0; Gaps 0; 

Qy 1 MAVRPGLWPALLGIVLAAWLRGSGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKP 60 

I I I I II I I I I I I I I I I I I I I I I I I I I I I II I II I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1 MAVRPGLWPVLLGIVLAAWLRGSGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKP 60 

Qy 61 VLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLE 120

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 61 VLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDSSSGLPTMEVRINVSRQQVEKVFGLE 120

Qy 121 EYWCQCVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAE 180 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 121 EYWCQCVAWSSSGTTKSQKAYIRIAYLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAE 180 

Qy 181 VEWLRNEDLVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSASAAVIVY 240

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 181 VEWLRNEDLVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSTSAAVIVY 240

Qy 241 VNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGS 300 

I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 241 VNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGS 300 

Qy 301 WSPWSKWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHSASGPEDVA 360 

II I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I : I I I I I I I I I I I I I : I : I I I I I I I 

Db 301 WSSWSKWSACGLDCTHWRSRECSDPAPRNGGEECRGADLDTRNCTSDLCLHTASCPEDVA 360 

Qy 361 LYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLL 420 

I I : I I : I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 361 LYIGLVAVAVCLFLLLLALGLIYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLL 420 

Qy 421 TIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSSPTSEAEEFVS 480 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I 1 I : I I I 
Db 421 TIQPDLSTTTTTYQGSLCSRQDGPSPKFQLSNGHLLSPLGSGRHTLHHSSPTSEAEDFVS 480 

Qy 481 RLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHK 540 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 481 RLSTQNYFRSLPRGTSNMAYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHK 540 

Qy 541 PEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSW 600 

I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I 
Db 541 PEDVRLPLAGCQTLLSPWSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSW 600 

Qy 601 EDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVACT 660 

I I I I I I I I I : II II I I I I I I | I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I 



Db 601 EDVLHLGEESPSHLYYCQLEAGACYVFTEQLGRFALVGEALSVAATKRLRLLLFAPVACT 660 

Qy 661 SLEYNIRVYCLHDTHDALKEWQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLW 720 

I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 661 SLEYNIRVYCLHDTHDALKEWQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLW 720 

Qy 721 KSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFSINF 780 

I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I : : I I I I I I I I : I I I I I I I I I I I I : I I I 
Db 721 KSKLLVSYQEIPFYHIWNGTQQYLHCTFTLERINASTSDLACKVWWQVEGDGQSFNINF 780 

Qy 781 NITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLAQKL 840 

M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I 
Db 781 NITKDTRFAELLALESEGGVPALVGPSAFKIPFLIRQKIIASLDPPCSRGADWRTLAQKL 840 

Qy 841 HLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAEC 898 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 841 HLDSHLSFFASKPSPTAMILNLWEARHFPNGNLGQLAAAVAGLGQPDAGLFTVSEAEC 898 
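
The summary figures printed for each result can be cross-checked against one another. The sketch below assumes, based only on the numbers in this listing, that "Query Match" is the score divided by the perfect score of 4791 given in the search header, and that "Best Local Similarity" is the number of matches divided by the total number of alignment columns (matches + conservative + mismatches + indels). These formulas are inferred from the printed values, not taken from the search program's documentation, and the function names are illustrative.

```python
# Hypothetical helpers for sanity-checking the per-result summary lines.
# The formulas are assumptions inferred from the printed numbers.

PERFECT_SCORE = 4791  # "Perfect score" from the search header


def query_match(score, perfect=PERFECT_SCORE):
    """Score as a percentage of the perfect (self-match) score."""
    return round(100.0 * score / perfect, 1)


def best_local_similarity(matches, conservative, mismatches, indels):
    """Identical positions as a percentage of all alignment columns."""
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


if __name__ == "__main__":
    # RESULT 5: Score 4638; Matches 862; Conservative 17; Mismatches 19; Indels 0
    print(query_match(4638))
    print(best_local_similarity(862, 17, 19, 0))
```

Under these assumptions the values reproduce the printed 96.8% and 96.0% for RESULT 5; a figure that fails the check is a hint that a digit was damaged in extraction.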



RESULT 6 
US-10-256-702-5 

; Sequence 5, Application US/10256702 
; Publication No. US20030059859A1 
GENERAL INFORMATION: 

APPLICANT: Tessier-Lavigne, Marc 
; Leonardo, E. David 

; Hink, Lindsay 

; Masu, Masayuki 

; Kazuko, Keino-Masu 

; TITLE OF INVENTION: Netrin Receptors 

NUMBER OF SEQUENCES: 8 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: SCIENCE & TECHNOLOGY LAW GROUP 
STREET: 268 BUSH STREET, SUITE 3200 
CITY: SAN FRANCISCO 
STATE: CALIFORNIA 
COUNTRY: USA 
ZIP: 94104 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.30 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/10/256,702 

FILING DATE: 27-Sep-2002 

CLASSIFICATION: <Unknown> 
; PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US/09/933,261 

FILING DATE: 20-Aug-2001 

APPLICATION NUMBER: 08/808,982 

FILING DATE: <Unknown> 
ATTORNEY/AGENT INFORMATION: 

NAME: OSMAN, RICHARD A 

REGISTRATION NUMBER: 36,627 

REFERENCE/DOCKET NUMBER: UC96-217 
TELECOMMUNICATION INFORMATION: 



TELEPHONE: (415) 343-4341 
TELEFAX: (415) 343-4342 
INFORMATION FOR SEQ ID NO: 5: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 898 amino acids 
; TYPE: amino acid 

STRANDEDNESS: Not Relevant 
TOPOLOGY: Not Relevant 
MOLECULE TYPE: peptide 
SEQUENCE DESCRIPTION: SEQ ID NO: 5: 
US-10-256-702-5 

Query Match 96.8%; Score 4638; DB 14; Length 898; 

Best Local Similarity 96.0%; Pred. No. 0; 

Matches 862; Conservative 17; Mismatches 19; Indels 0; Gaps 0; 

Qy 1 MAVRPGLWPALLGIVLAAWLRGSGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKP 60 

I I 1 I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I 
Db 1 MAVRPGLWPVLLGIVLAAWLRGSGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKP 60 

Qy 61 VLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLE 120 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I 
Db 61 VLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDSSSGLPTMEVRINVSRQQVEKVFGLE 120 

Qy 121 EYWCQCVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAE 180 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I 
Db 121 EYWCQCVAWSSSGTTKSQKAYIRIAYLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAE 180 

Qy 181 VEWLRNEDLVDPSLDPNVYITREHSLWRQARLADTANYTCVAKNIVARRRSASAAVIVY 240 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I 
Db 181 VEWLRNEDLVDPSLDPNVYITREHSLWRQARLADTANYTCVAKNIVARRRSTSAAVIVY 240 

Qy 241 VNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGS 300 

I I I I I I I I I I M I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 241 VNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGS 300 

Qy 301 WSPWSKWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHSASGPEDVA 360 

II II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I : I : I I I I I I I 

Db 301 WSSWSKWSACGLDCTHWRSRECSDPAPRNGGEECRGADLDTRNCTSDLCLHTASCPEDVA 360 

Qy 361 LYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLL 420 

I I : I I : I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 361 LYIGLVAVAVCLFLLLLALGLIYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLL 420 

Qy 421 TIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSSPTSEAEEFVS 480 

I I I I I I I I I I I I I I I I I I I I I I II I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I : I I I 
Db 421 TIQPDLSTTTTTYQGSLCSRQDGPSPKFQLSNGHLLSPLGSGRHTLHHSSPTSEAEDFVS 480 

Qy 481 RLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHK 540 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 481 RLSTQNYFRSLPRGTSNMAYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHK 540 

Qy 541 PEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSW 600 

I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 541 PEDVRLPLAGCQTLLSPWSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSW 600 



Qy 601 EDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVACT 660 

I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 111:111111111 
Db 601 

Qy 661 SLEYNIRVYCLHDTHDALKEWQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLW 720 

I I I I I I I I I I I 1 I I I I I I I I I I I I I I I I I 11 I I I I I I I I I I I I I II I I I I I I I I I I I I I I 
Db 661 SLEYNIRVYCLHDTHDALKEWQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLW 720 

Qy 721 KSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFSINF 780 

I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I : : I I I I I I I I : I I I I I I I I I I I I : I I I 
Db 721 

Qy 781 NITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLAQKL 840 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I 
Db 781 NITKDTRFAELLALESEGGVPALVGPSAFKIPFLIRQKIIASLDPPCSRGADWRTLAQKL 840 

Qy 841 HLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAEC 898 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I 
Db 841 HLDSHLSFFASKPSPTAMILNLWEARHFPNGNLGQLAAAVAGLGQPDAGLFTVSEAEC 898 



RESULT 7 

US-10-240-154-16 

; Sequence 16, Application US/10240154 

; Publication No. US20030175741A1 

; GENERAL INFORMATION: 

; APPLICANT: Cochran et al. 

; TITLE OF INVENTION: SCHIZOPHRENIA RELATED GENES 

; FILE REFERENCE: CKFW-P01-006 

; CURRENT APPLICATION NUMBER: US/10/240,154 

; CURRENT FILING DATE: 2001-04-02 

; PRIOR APPLICATION NUMBER: PCT/GB01/01486 

; PRIOR FILING DATE: 2001-04-02 

; NUMBER OF SEQ ID NOS: 34 

SOFTWARE: PatentIn version 3.2 
; SEQ ID NO 16 
LENGTH: 898 
TYPE: PRT 
; ORGANISM: Rattus sp. 
US-10-240-154-16 

Query Match 96.8%; Score 4638; DB 14; Length 898; 

Best Local Similarity 96.0%; Pred. No. 0; 

Matches 862; Conservative 17; Mismatches 19; Indels 0; Gaps 0; 



Qy 1 MAVRPGLWPALLGIVLAAWLRGSGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKP 60 

I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1 MAVRPGLWPVLLGIVLAAWLRGSGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKP 60 

Qy 61 VLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLE 120 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 61 VLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDSSSGLPTMEVRINVSRQQVEKVFGLE 120 

Qy 121 EYWCQCVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAE 180 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 121 EYWCQCVAWSSSGTTKSQKAYIRIAYLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAE 180 

Qy 181 VEWLRNEDLVDPSLDPNVYITREHSLWRQARLADTANYTCVAKNIVARRRSASAAVIVY 240 



I I I I I I I I I I I ! I I I I I II I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I 
Db 181 VEWLRNEDLVDPSLDPNVYITREHSLWRQARLADTANYTCVAKNIVARRRSTSAAVIVY 240 

Qy 241 VNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGS 300 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 241 VNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGS 300 

Qy 301 WSPWSKWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHSASGPEDVA 360 

II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I : I : I I I I I I I 

Db 301 WSSWSKWSACGLDCTHWRSRECSDPAPRNGGEECRGADLDTRNCTSDLCLHTASCPEDVA 360 

Qy 361 LYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLL 420 

I I : I I : I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 361 LYIGLVAVAVCLFLLLLALGLIYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLL 420 

Qy 421 TIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSSPTSEAEEFVS 480 

I I I I I I I I I I I I I I I I I I I I I I I I I I I II : I I I I I I I I I I I I II I I I I I I I I I I : I I I 

Db 421 TIQPDLSTTTTTYQGSLCSRQDGPSPKFQLSNGHLLSPLGSGRHTLHHSSPTSEAEDFVS 480 

Qy 481 RLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHK 540 

I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I II I 
Db 481 RLSTQNYFRSLPRGTSNMAYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHK 540 

Qy 541 PEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSW 600 

I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 541 PEDVRLPLAGCQTLLSPWSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSW 600 

Qy 601 EDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVACT 660 

I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 111:1111111111 
Db 601 EDVLHLGEESPSHLYYCQLEAGACYVFTEQLGRFALVGEALSVAATKRLRLLLFAPVACT 660 

Qy 661 SLEYNIRVYCLHDTHDALKEWQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLW 720 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 661 SLEYNIRVYCLHDTHDALKEWQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLW 720 

Qy 721 KSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFSINF 780 

I I I I I I I I I I I I I I II I I I I I : I I I I I I I I I I : : I I I I I I I I : I I I I I I I I I I I I : I I I 
Db 721 KSKLLVSYQEIPFYHIWNGTQQYLHCTFTLERINASTSDLACKVWVWQVEGDGQSFNINF 780 

Qy 781 NITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLAQKL 840 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I 

Db 781 NITKDTRFAELLALESEGGVPALVGPSAFKIPFLIRQKIIASLDPPCSRGADWRTLAQKL 840 

Qy 841 HLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAEC 898 

I I I I I I I I I I I I I I I I I I I II I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 841 HLDSHLSFFASKPSPTAMILNLWEARHFPNGNLGQLAAAVAGLGQPDAGLFTVSEAEC 898 



RESULT 8 
US-10-311-623-1 

; Sequence 1, Application US/10311623 
; Publication No. US20040023244A1 
; GENERAL INFORMATION: 

; APPLICANT: INCYTE GENOMICS, INC.; GRIFFIN, Jennifer A. 
; APPLICANT: KALLICK, Deborah A. ; TRIBOULEY, Catherine M. 
; APPLICANT: YUE, Henry; NGUYEN, Danniel B. 
; APPLICANT: TANG, Y. Tom; LAL, Preeti G. 



APPLICANT: POLICKY, Jennifer L.; AZIMZAI, Yalda 
APPLICANT: LU, Dyung Aina M. ; GRAUL, Richard C. 
APPLICANT: YAO, Monique G. ; BURFORD, Neil 
APPLICANT: HAFALIA, April J. A.; BAUGHN, Mariah R. 
APPLICANT: BANDMAN, Olga; ARVIZU, Chandra S. 
APPLICANT: YANG, Junming; XU, Yuming 
APPLICANT: GANDHI, Ameena R. ; WARREN, Bridget A. 
APPLICANT: DING, Li; SANJANWALA, Madhusudan M. 
APPLICANT: DUGGAN, Brendan M. ; LU, Yan 
TITLE OF INVENTION: RECEPTORS 
FILE REFERENCE: PF-0793 USN 

CURRENT APPLICATION NUMBER: US/10/311,623 
CURRENT FILING DATE: 2002-12-17 
PRIOR APPLICATION NUMBER: US 01/19942 
PRIOR FILING DATE: 2001-06-21 

PRIOR APPLICATION NUMBER: US 60/214,027 
PRIOR FILING DATE: 2000-06-21 
PRIOR APPLICATION NUMBER: US 60/228,045 
PRIOR FILING DATE: 2000-08-25 
PRIOR APPLICATION NUMBER: US 60/255,104 



PRIOR FILING DATE: 2000-12-12 
NUMBER OF SEQ ID NOS : 24 
SOFTWARE: PERL Program 
SEQ ID NO 1 
LENGTH: 842 
TYPE: PRT 

ORGANISM: Homo sapiens 
FEATURE: 

NAME/KEY: misc_feature 

OTHER INFORMATION: Incyte ID No: 6052371CD1 
US-10-311-623-1 



Query Match 92.1%; Score 4413; DB 16; Length 842; 

Best Local Similarity 93.5%; Pred. No. 0; 

Matches 840; Conservative 1; Mismatches 1; Indels 56; Gaps 1; 



Qy 1 MAVRPGLWPALLGIVLAAWLRGSGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKP 60 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1 MAVRPGLWPALLGIVLAAWLRGSGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKP 60 

Qy 61 VLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLE 120 

I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 61 VLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLE 120 

Qy 121 EYWCQCVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAE 180 

I I I I I I I I I ill I I I I I I I I I I I I I I II I I II I I I I I I I I I I I I I II I I I I I I I I II I II 
Db 121 EYWCQCVAWSSSGTTKSQKAYIRIAYLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAE 180 

Qy 181 VEWLRNEDLVDPSLDPNVYITREHSLWRQARLADTANYTCVAKNIVARRRSASAAVIVY 240 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 181 VEWLRNEDLVDPSLDPNVYITREHSLWRQARLADTANYTCVAKNIVARRRSASAAVIVY 240 



Qy 241 VNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGS 300 

I I I I 

Db 241 VDGS 244 

Qy 301 WSPWSKWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHSASGPEDVA 360 



I I I I I I I I I I I I I I I I i I I I I I I I I I I I I I I I I I I I I I I I I I I i I I I I I I I : I I I I I I I I 
Db 245 WSPWSKWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHTASGPEDVA 304 

Qy 361 LYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLL 420 

1 1 1 1 1 1 1 1 1 ill 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 nil 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ii 1 1 ii 
Db 305 LYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLL 364 

Qy 421 TIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSSPTSEAEEFVS 480 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 365 TIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSSPTSEAEEFVS 424 

Qy 481 RLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHK 540 

I I I I I I I I ill I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 425 RLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHK 484 

Qy 541 PEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSW 600 

1 1 1 1 1 1 1 II 1 1 1 1 1 1 II I II 1 1 1 II 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ill I 
Db 485 PEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSW 544 

Qy 601 EDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVACT 660 

I I I I ! I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I II I 
Db 545 EDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVACT 604 

Qy 661 SLEYNIRVYCLHDTHDALKEWQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLW 720 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 605 SLEYNIRVYCLHDTHDALKEWQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLW 664 

Qy 721 KSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFSINF 780 

I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 665 KSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFSINF 724 

Qy 781 NITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLAQKL 840 

I I I I II I I I I I II I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 725 NITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLAQKL 784 

Qy 841 HLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAEC 898 

I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I 
Db 785 HLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVS 



RESULT 9 

US-09-970-944-14 

Sequence 14, Application US/09970944 
Publication No. US20030204052A1 
GENERAL INFORMATION: 
APPLICANT: Herrman, John L 
APPLICANT: Rastelli, Luca 
APPLICANT: Shimkets, Richard A 

TITLE OF INVENTION: Novel Proteins and Nucleic Acids Encoding Same and 
TITLE OF INVENTION: Antibodies Directed Against these Proteins 
FILE REFERENCE: 21402-138 

CURRENT APPLICATION NUMBER: US/09/970,944 
CURRENT FILING DATE: 2002-05-02 
PRIOR APPLICATION NUMBER: 60/237,862 
PRIOR FILING DATE: 2000-10-04 
NUMBER OF SEQ ID NOS: 62 
; SOFTWARE: PatentIn Ver. 2.1 
; SEQ ID NO 14 
LENGTH: 544 

TYPE: PRT 
; ORGANISM: Homo sapiens 
US-09-970-944-14 



Query Match 59.4%; Score 2845; DB 11; Length 544; 

Best Local Similarity 100.0%; Pred. No. 1.5e-236; 

Matches 541; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 



Qy 358 DVALYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNP 417 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I I I I I I I I 
Db 4 DVALYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNP 63 

Qy 418 HLLTIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSSPTSEAEE 477 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M 1 I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I I I I I I I I I | I | I I | 
Db 64 HLLTIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSSPTSEAEE 123 

Qy 478 FVSRLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLT 537 

1 1 1 1 1 1 II 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 I I I I I I I I I I I I I I I I I M 
Db 124 FVSRLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLT 183 

Qy 538 LHKPEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCE 597 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 
Db 184 LHKPEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCE 243 

Qy 598 GSWEDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPV 657 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 
Db 244 GSWEDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPV 303 

Qy 658 ACTSLEYNIRVYCLHDTHDALKEWQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPS 717 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 
Db 304 ACTSLEYNIRVYCLHDTHDALKEWQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPS 363 

Qy 718 SLWKSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFS 777 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 
Db 364 SLWKSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFS 423 

Qy 778 INFNITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLA 837 

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 
Db 424 INFNITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLA 483 

Qy 838 QKLHLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAE 897 

1 1 M 1 1 1 1 1 1 1 M 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 
Db 484 QKLHLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAE 543 

Qy 898 C 898 

1 
Db 544 C 544 





RESULT 10 
US-09-933-261-6 

; Sequence 6, Application US/09933261 
; Publication No. US20030040046A1 
GENERAL INFORMATION: 



; APPLICANT: Tessier-Lavigne, Marc 

; Leonardo, E. David 

; Hink, Lindsay 

; Masu, Masayuki 

; Kazuko, Keino-Masu 

TITLE OF INVENTION: Netrin Receptors 
NUMBER OF SEQUENCES: 8 
CORRESPONDENCE ADDRESS: 

ADDRESSEE: SCIENCE & TECHNOLOGY LAW GROUP 
STREET: 268 BUSH STREET, SUITE 3200 
CITY: SAN FRANCISCO 
STATE: CALIFORNIA 
COUNTRY: USA 
ZIP: 94104 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 
COMPUTER: IBM PC compatible 
OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.30 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/09/933,261 

FILING DATE: 20-Aug-2001 

CLASSIFICATION: <Unknown> 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: 08/808,982 

FILING DATE: <Unknown> 
ATTORNEY/AGENT INFORMATION: 
; NAME: OSMAN, RICHARD A 

; REGISTRATION NUMBER: 36,627 

REFERENCE/DOCKET NUMBER: UC96-217 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: (415) 343-4341 

TELEFAX: (415) 343-4342 
; INFORMATION FOR SEQ ID NO: 6: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 557 amino acids 

TYPE: amino acid 

STRANDEDNESS: Not Relevant 
TOPOLOGY: Not Relevant 
MOLECULE TYPE: peptide 
SEQUENCE DESCRIPTION: SEQ ID NO: 6: 
US-09-933-261-6 

Query Match 58.8%; Score 2815.5; DB 10; Length 557; 

Best Local Similarity 96.8%; Pred. No. 5.7e-234; 

Matches 539; Conservative 2; Mismatches 15; Indels 1; Gaps 1; 

Qy 343 NCTSDLCVHSASGPEDVALYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTS 402 

I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II 
Db 1 NCTSDLXVHTASGPEDVALYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTS 60 

Qy 403 GFQPVSIKPSKADNPHLLTIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGG 462 

I I II I I I I I I I I I I I I II I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 61 GFQPVSIKPSKADNPHLLTIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGG 120 

Qy 463 RHTLHHSSPTSEAEEFVSRLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIP 522 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 



Db 121 RHTLHHSSPTSEAEEFVSRLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIP 180 



Qy 523 PDAIPRGKIYEIYLTLHKPEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEP 582 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 181 PDAIPRGKIYEIYLTLHKPEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEP 240 

Qy 583 SPDSWSLRLKKQSCEGSWEDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALS 642 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I 

Db 241 SPDSWSLALKKQSCEGSWEDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALS 300 

Qy 643 VAAAKRLKLLLFAPVACTSLEYNIRVYCLHDTHDALKEWQLEKQLGGQLIQEPRVLHFK 702 

I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I II I I I I I I I I I I I I I I I I I II I I 

Db 301 VAAAKRLKLLLFAPVACTSLEYNIRVYCLHDTHDALKEWQLEKQLGGQLIQEPRVLHLX 360 

Qy 703 DSYHNLRLSIHDVPSSLWKSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLAC 762 

I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I ! I I I I I 
Db 361 DSYHNLXLSXHDVPSSLWKSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLAC 420 

Qy 763 KLWVWQVEGDGQSFSINFNITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISS 822 

I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 421 KLWVWQVEGDGQSFSINFNITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISS 480 

Qy 823 LDPPCRRGADWRTLAQKLHLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAG 882 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I II I I I 
Db 481 LDPPCRRGADWRTLAQKLHLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAG 540 

Qy 883 LGQPDAGLFT-VSEAEC 898 

I : I I I I I 
Db 541 TXPAGRWLLSQCSEAEC 557 



RESULT 11 
US-10-256-702-6 

; Sequence 6, Application US/10256702 
; Publication No. US20030059859A1 

GENERAL INFORMATION: 
; APPLICANT: Tessier-Lavigne, Marc 

; Leonardo, E. David 

; Hink, Lindsay 

; Masu, Masayuki 

Kazuko, Keino-Masu 
; TITLE OF INVENTION: Netrin Receptors 

; NUMBER OF SEQUENCES: 8 

; CORRESPONDENCE ADDRESS: 

ADDRESSEE: SCIENCE & TECHNOLOGY LAW GROUP 
; STREET: 268 BUSH STREET, SUITE 3200 

CITY: SAN FRANCISCO 

STATE: CALIFORNIA 

COUNTRY: USA 

ZIP: 94104 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.30 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/10/256,702 



FILING DATE: 27-Sep-2002 
CLASSIFICATION: <Unknown> 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: US/09/933,261 
FILING DATE: 20-Aug-2001 
APPLICATION NUMBER: 08/808,982 
FILING DATE: <Unknown> 
ATTORNEY/AGENT INFORMATION: 
NAME: OSMAN, RICHARD A 
REGISTRATION NUMBER: 36,627 
REFERENCE/DOCKET NUMBER: UC96-217 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: (415) 343-4341 
TELEFAX: (415) 343-4342 
INFORMATION FOR SEQ ID NO: 6: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 557 amino acids 
TYPE: amino acid 

STRANDEDNESS: Not Relevant 
TOPOLOGY: Not Relevant 
MOLECULE TYPE: peptide 
SEQUENCE DESCRIPTION: SEQ ID NO: 6: 
US-10-256-702-6 

Query Match 58.8%; Score 2815.5; DB 14; Length 557; 

Best Local Similarity 96.8%; Pred. No. 5.7e-234; 

Matches 539; Conservative 2; Mismatches 15; Indels 1; Gaps 1; 

Qy 343 NCTSDLCVHSASGPEDVALYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTS 402 

I I I I I I | | : | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I M I I I I M I I 
Db 1 NCTSDLXVHTASGPEDVALYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTS 60 

Qy 403 GFQPVSIKPSKADNPHLLTIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGG 462 

I 1 1 1 1 1 i i I 1 1 I i I I M 1 1 I 1 1 I 1 1 1 1 I I I 1 1 1 1 I 1 1 1 1 1 1 1 1 1 I 1 1 I I I 1 1 I I 
Db 61 GFQPVSIKPSKADNPHLLTIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGG 120 

Qy 463 RHTLHHSSPTSEAEEFVSRLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIP 522 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 121 RHTLHHSSPTSEAEEFVSRLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIP 180 

Qy 523 PDAIPRGKIYEIYLTLHKPEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEP 582 

| | | | | | I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 181 PDAIPRGKIYEIYLTLHKPEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEP 240 

Qy 583 SPDSWSLRLKKQSCEGSWEDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALS 642 

| | M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 241 SPDSWSLALKKQSCEGSWEDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALS 300 

Qy 643 VAAAKRLKLLLFAPVACTSLEYNIRVYCLHDTHDALKEWQLEKQLGGQLIQEPRVLHFK 702 

| | | I II I I I I I I I I I I I I I I I I I I M I I I I M I I I I I I I I I I I I I I I > I > I I M I I I I 
Db 301 VAAAKRLKLLLFAPVACTSLEYNIRVYCLHDTHDALKEWQLEKQLGGQLIQEPRVLHLX 360 

Qy 703 DSYHNLRLSIHDVPSSLWKSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLAC 762 

| | || I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 361 DSYHNLXLSXHDVPSSLWKSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLAC 420 

Qy 763 KLWVWQVEGDGQSFSINFNITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISS 822 



1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 

Db 421 KLWVWQVEGDGQSFSINFNITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISS 480 

Qy 823 LDPPCRRGADWRTLAQKLHLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAG 882 

I | I I I I | i I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 481 LDPPCRRGADWRTLAQKLHLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAG 540 

Qy 883 LGQPDAGLFT-VSEAEC 898 

I : I I I I I 
Db 541 TXPAGRWLLSQCSEAEC 557 



RESULT 12 
US-09-970-944-15 

; Sequence 15, Application US/09970944 

; Publication No. US20030204052A1 

; GENERAL INFORMATION: 

; APPLICANT: Herrman, John L 

; APPLICANT: Rastelli, Luca 

; APPLICANT: Shimkets, Richard A 

; TITLE OF INVENTION: Novel Proteins and Nucleic Acids Encoding Same and 
; TITLE OF INVENTION: Antibodies Directed Against these Proteins 
; FILE REFERENCE: 21402-138 

; CURRENT APPLICATION NUMBER: US/09/970,944 

; CURRENT FILING DATE: 2002-05-02 

; PRIOR APPLICATION NUMBER: 60/237,862 

; PRIOR FILING DATE: 2000-10-04 

; NUMBER OF SEQ ID NOS: 62 

; SOFTWARE: PatentIn Ver. 2.1 

; SEQ ID NO 15 

LENGTH: 931 
; TYPE: PRT 

; ORGANISM: Caenorhabditis elegans 
US-09-970-944-15 

Query Match 58.2%; Score 2787; DB 11; Length 931; 

Best Local Similarity 57.3%; Pred. No. 3.6e-231; 

Matches 522; Conservative 153; Mismatches 208; Indels 28; Gaps 9; 



Qy 9 PAL LG I VLAAW L RG S GAQQ S A TVANPVPGANPDLLPHFLVEPEDVYIVKNKPVLLVC 65 

Ml : I : I II I I : I I : I I I I I : I I I : I I I I I I I I I I 

Db 26 PAL — ALLSASGTGSAAQDDEFFHELPETFPSDPPEPLPHFLIEPEEAYIVKNKPVNLYC 83 

Qy 66 KAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLEEYWCQ 125 

I I I II I I : I I I I I I I I I I I : : I : I I I I I I : I I I I I I :: I I I : I I I I 

Db 84 KAS PATQIYFKCNSEWVHQKDHWDERVDETSGLIVREVSIEISRQQVEELFGPEDYWCQ 143 

Qy 126 CVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAEVEWLR 185 

I I I I I I : I I I I I : I I I : I I I III I I II I I I I I I I I I : : I I I II I I II Mill!: 
Db 144 CVAWSSAGTTKSRKAYVRIAYLRKTFEQEPLGKEVSLEQEVLLQCRPPEGIPVAEVEWLK 203 

Qy 186 NEDLVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSASAAVIVYVNGGW 245

I I I : : I I : I I III : I : I : : : I I I I : I I I I I I I I I I I I I I : I : I : I I I I I I I I I I 
Db 204 NEDIIDPAEDRNFYITIDHNLIIKQARLSDTANYTCVAKNIVAKRKSTTATVIVYVNGGW 263 

Qy 246 STWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGSWSPWS 305 



I I I I I I I I I : : I I I I : I I I : I : I I I I I I I I I I I I I I I I : I I I M I I I I I I I I : II 

Db 264 STWTEWSVCNSRCGRGYQKRTRTCTNPAPLNGGAFCEGQSVQKIACTTLCPVDGRWTSWS 323

Qy 306 KWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHSASGPEDVALYVGL 365 

I I I I I : I I I I I I I I : I I I : I I I :: I I I : : I I I I I : : I : I I I I I I I : 

Db 324 KWSTCGTECTHWRRRECTAPAPKNGGKDCDGLVLQSKNCTDGLCMQAAPDSDDVALYVGI 383 

Qy 366 -IAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLLTIQP 424 

I I I I I I : : : I : I I : : I I : I I I I I I I I I : I I : : I I I : I 

Db 384 VIAVTVCLAITVVVALFVYRKNHRDFESDIIDSSALNGGFQPVNIKAARQD---LLAVPP 440

Qy 425 DLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSS----PTSEAEEFVS 480

II:: I : I : I I I : I I : I I I : : : : I I I : I I I 

Db 441 DLTSAAAMYRGPVYALHD-VSDKIPMTNSPILDPLPNLKIKVYNSSGAVTPQDDLAEFSS 499

Qy 481 RLS TQNYF RSLPRGT--SNMTYGTFNFLGGRLMIPNTGISLLIPPDAI 526

: II II: : I I I I I : I I I I I I I I : I I I : I : I I I I I II 

Db 500 KLSPQMTQSLLENEALNLKNQSLARQTDPSCTAFGTFNSLGGHLIIPNSGVSLLIPAGAI 559 

Qy 527 PRGKIYEIYLTLHKPEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDS 586

I : I : : I I : I : I : I : I : : I I : I I I I : I : I I I I I I I I I I I I I I I : II : I I : 
Db 560 PQGRVYEMYVTVHRKENMRPPMEDSQTLLTPVVSCGPPGALLTRPVILTLHHCADPSTED 619

Qy 587 WSLRLKKQSCEGSWEDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAA 646

I : : I I I : : I I I I I : : I I I : I I I : I I I : : I I I : I I I I : : : I I I 

Db 620 WKIQLKNQAVQGQWEDVVVVGEENFTTPCYIQLDAEACHILTENLSTYALVGQSTTKAAA 679

Qy 647 KRLKLLLFAPVACTSLEYNIRVYCLHDTHDALKEVVQLEKQLGGQLIQEPRVLHFKDSYH 706

I I I I I : I I : I : I I I I : I I I I I I II I I I I I I : I I I : I : I I I I : : I I : MM I I 

Db 680 KRLKLAIFGPLCCSSLEYSIRVYCLDDTQDALKEVLQLERQMGGQLLEEPKALHFKGSIH 739 

Qy 707 NLRLSIHDVPSSLWKSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWV 766

II II I I I I : I II II II I II I II I I I I I : I : II II I I I II II : I : I : I II I I 

Db 740 NLRLSIHDIAHSLWKSKLLAKYQEIPFYHIWSGSQRNLHCTFTLERLSLNTVELVCKLCV 799 

Qy 767 WQVEGDGQSFSINFNITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISSLDPP 826

I II I : II I : I : : : : : I I : : : : I M M II I I I I : I M I I 

Db 800 RQVEGEGQIFQLNCTVSEEPTGIDLPLLDPASTITTVTGPSAFSIPLPIRQKLCSSLDAP 859 

Qy 827 CRRGADWRTLAQKLHLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQP 886 

I I I II I I II : I I : I : : II : I I II : I I : M I I : : II : I II I I I I : : I : 

Db 860 QTRGHDWRMLAHKLNLDRYLNYFATKSSPTGVILDLWEAQNFPDGNLSMLAAVLEEMGRH 919 

Qy 887 DAGLFTVSEAE 897 

. I 
. i 

Db 920 ETWS LAAEGQ 930 



RESULT 13 
US-10-087-684-35 

Sequence 35, Application US/10087684 
Publication No. US20040029116A1 
GENERAL INFORMATION : 
APPLICANT: Edinger, Shlomit R. 
APPLICANT: MacDougall, John R. 
APPLICANT: Millet, Isabelle 
APPLICANT: Ellerman, Karen 



APPLICANT: Stone, David J.
APPLICANT: Grosse, William M.
APPLICANT: Lepley, Denise M.
APPLICANT: Rieger, Daniel K.
APPLICANT: Burgess, Catherine E.
APPLICANT: Casman, Stacie J.
APPLICANT: Spytek, Kimberly A.
APPLICANT: Boldog, Ferenc L.
APPLICANT: Li, Li
APPLICANT: Padigaru, Muralidhara
APPLICANT: Mishra, Vishnu
APPLICANT: Shenoy, Suresh G.
APPLICANT: Rastelli, Luca
APPLICANT: Tchernev, Velizar T.
APPLICANT: Vernet, Corine A.M.
APPLICANT: Zerhusen, Bryan D.
APPLICANT: Malyankar, Uriel M.
APPLICANT: Guo, Xiaojia
APPLICANT: Miller, Charles E.
APPLICANT: Gangolli, Esha A.



; TITLE OF INVENTION: PROTEINS AND NUCLEIC ACIDS ENCODING SAME 

; FILE REFERENCE: 21402-214 CIP 

; CURRENT APPLICATION NUMBER: US/10/087,684

; CURRENT FILING DATE: 2003-03-10 

; PRIOR APPLICATION NUMBER: 60/253,834 

PRIOR FILING DATE: 2000-11-29 
; PRIOR APPLICATION NUMBER: 60/250,926 
; PRIOR FILING DATE: 2000-11-30 
; PRIOR APPLICATION NUMBER: 60/264,180 

PRIOR FILING DATE: 2001-01-25 
; PRIOR APPLICATION NUMBER: 60/274,194 
; PRIOR FILING DATE: 2001-03-08
; PRIOR APPLICATION NUMBER: 60/313,656
; PRIOR FILING DATE: 2001-08-20 

PRIOR APPLICATION NUMBER: 60/327,456 

PRIOR FILING DATE: 2001-10-05 
; NUMBER OF SEQ ID NOS : 220 
; SOFTWARE: CuraSeqList version 0.1 
; SEQ ID NO 35 
LENGTH: 931 
TYPE: PRT 
; ORGANISM: Mus musculus
US-10-087-684-35 

Query Match 58.2%; Score 2787; DB 12; Length 931; 

Best Local Similarity 57.3%; Pred. No. 3.6e-231; 

Matches 522; Conservative 153; Mismatches 208; Indels 28; Gaps 9; 

Qy 9 PALLGIVLAAWLRGSGAQQSA---TVANPVPGANPDLLPHFLVEPEDVYIVKNKPVLLVC 65

III : I : I I I I I : I I : I I I I I : I I I : I I I I I I I I I I 

Db 26 PAL--ALLSASGTGSAAQDDEFFHELPETFPSDPPEPLPHFLIEPEEAYIVKNKPVNLYC 83 

Qy 66 KAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLEEYWCQ 125 

I I I I I I I : I I I I I I I I I II : : I s I I I I I I : I I I I I I : : I I I : I I I I 

Db 84 KASPATQIYFKCNSEWVHQKDHVVDERVDETSGLIVREVSIEISRQQVEELFGPEDYWCQ 143



Qy 126 CVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAEVEWLR 185 



I I I I I I : I I II I : I I I : I I I III I I I I I I I I I I I I I :: I I I I I I I I I I I I I I I : 

Db 144 CVAWSSAGTTKSRKAYVRIAYLRKTFEQEPLGKEVSLEQEVLLQCRPPEGIPVAEVEWLK 203 

Qy 186 NEDLVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSASAAVIVYVNGGW 245

I I I : : I I : I I III : I : I : : : I I I I : I I I I I I I I I I I I I I : I : I : I I I I I I I I I I 
Db 204 NEDIIDPAEDRNFYITIDHNLIIKQARLSDTANYTCVAKNIVAKRKSTTATVIVYVNGGW 263

Qy 246 STWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGSWSPWS 305

I I I I I I I I I :: I I I I : I I I : I : I I I I I I I I I I I I I I I I : I I I II I I I I I I I I : II 

Db 264 STWTEWSVCNSRCGRGYQKRTRTCTNPAPLNGGAFCEGQSVQKIACTTLCPVDGRWTSWS 323

Qy 306 KWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHSASGPEDVALYVGL 365 

I I I I I : I I I I I I I I : I I I : I I I : : I I I : : I I I I I : : I : I I I I I I I : 
Db 324 KWSTCGTECTHWRRRECTAPAPKNGGKDCDGLVLQSKNCTDGLCMQAAPDSDDVALYVGI 383 

Qy 366 -IAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLLTIQP 424 

I I I I I I : : : I : I I : : I I : I I I I I I I I I : I I s s I I I : I 
Db 384 VIAVTVCLAITVVVALFVYRKNHRDFESDIIDSSALNGGFQPVNIKAARQD---LLAVPP 440

Qy 425 DLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSS----PTSEAEEFVS 480

II:: I : I : I II : I I : I I I : : : : I I I : III 
Db 441 DLTSAAAMYRGPVYALHD-VSDKIPMTNSPILDPLPNLKIKVYNSSGAVTPQDDLAEFSS 499

Qy 481 RLS TQNYF RSLPRGT--SNMTYGTFNFLGGRLMIPNTGISLLIPPDAI 526

: I I II: : I I I I I : I I I I I I I I : I I I : I : I I I I I II 

Db 500 KLSPQMTQSLLENEALNLKNQSLARQTDPSCTAFGTFNSLGGHLIIPNSGVSLLIPAGAI 559 

Qy 527 PRGKIYEIYLTLHKPEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDS 586

I : I : : I I : I : I : I : I : : I I : ' I I I I : I : I I I I I I I I I I I I I I I : II : I I : 
Db 560 PQGRVYEMYVTVHRKENMRPPMEDSQTLLTPVVSCGPPGALLTRPVILTLHHCADPSTED 619

Qy 587 WSLRLKKQSCEGSWEDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAA 646

I : : I I I : : I I I I I : : I I I : I I I : I I I : : I I I : I I I I : : : I I I 
Db 620 WKIQLKNQAVQGQWEDVVVVGEENFTTPCYIQLDAEACHILTENLSTYALVGQSTTKAAA 679

Qy 647 KRLKLLLFAPVACTSLEYNIRVYCLHDTHDALKEVVQLEKQLGGQLIQEPRVLHFKDSYH 706

I I II I : I I : I : I I I I : I I I I I I II I I I I I I : I I I : I : I I I I : : I I : I I I I I I 

Db 680 KRLKLAIFGPLCCSSLEYSIRVYCLDDTQDALKEVLQLERQMGGQLLEEPKALHFKGSIH 739

Qy 707 NLRLSIHDVPSSLWKSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWV 766 

I I II I I I I : I I I I I I I I I I I I I I I I I I : I : I I I I I I I I I I I : I : I : I III I 
Db 740 NLRLSIHDIAHSLWKSKLLAKYQEIPFYHIWSGSQRNLHCTFTLERLSLNTVELVCKLCV 799 

Qy 767 WQVEGDGQSFSINFNITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISSLDPP 826 

I I I.I : I I I : I : : : : : I I : : : : I I I I I I I I I I I : I I I I I 
Db 800 RQVEGEGQIFQLNCTVSEEPTGIDLPLLDPASTITTVTGPSAFSIPLPIRQKLCSSLDAP 859 

Qy 827 CRRGADWRTLAQKLHLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQP 886

II III II I I : I I : I : : I I : I III : I I : I I I I : : I I : I I I I III : : I : 
Db 860 QTRGHDWRMLAHKLNLDRYLNYFATKSSPTGVILDLWEAQNFPDGNLSMLAAVLEEMGRH 919 

Qy 887 DAGLFTVSEAE 897

; : : I : 
Db 920 ETWS LAAE GQ 930 



RESULT 14 



US-09-972-211-121 

; Sequence 121, Application US/09972211 
; Publication No. US20040048245A1 
; GENERAL INFORMATION: 



; APPLICANT: Shimkets, Richard A
; APPLICANT: Taupier Jr, Raymond J
; APPLICANT: Burgess, Catherine E
; APPLICANT: Zerhusen, Bryan D
; APPLICANT: Mezes, Peter S
; APPLICANT: Rastelli, Luca
; APPLICANT: Malyankar, Uriel M
; APPLICANT: Grosse, William M
; APPLICANT: Alsobrook II, John P
; APPLICANT: Lepley, Denise M
; APPLICANT: Spytek, Kimberly Ann
; APPLICANT: Li, Li
; APPLICANT: Edinger, Shlomit
; APPLICANT: Gerlach, Valerie
; APPLICANT: Ellerman, Karen
; APPLICANT: MacDougall, John R
; APPLICANT: Gunther, Erik
; APPLICANT: Millet, Isabelle
; APPLICANT: Stone, David J
; APPLICANT: Smithson, Glennda
; APPLICANT: Szekeres Jr, Edward S



; TITLE OF INVENTION: Novel Human Proteins, Polynucleotides Encoding Them And

; TITLE OF INVENTION: Methods Of Using The Same 
; FILE REFERENCE: 21402-141 

; CURRENT APPLICATION NUMBER: US/09/972,211 

; CURRENT FILING DATE: 2001-10-05 

; PRIOR APPLICATION NUMBER: 60/238,325 

; PRIOR FILING DATE: 2000-10-05 

; PRIOR APPLICATION NUMBER: 60/238,323 

PRIOR FILING DATE: 2000-10-05 

PRIOR APPLICATION NUMBER: 60/238,400 
; PRIOR FILING DATE: 2000-10-06 
; PRIOR APPLICATION NUMBER: 60/238,397 
; PRIOR FILING DATE: 2000-10-06 
; PRIOR APPLICATION NUMBER: 60/238,401 

PRIOR FILING DATE: 2000-10-06 

PRIOR APPLICATION NUMBER: 60/238,379 

PRIOR FILING DATE: 2000-10-06 
; PRIOR APPLICATION NUMBER: 60/238,402 
; PRIOR FILING DATE: 2000-10-06 
; PRIOR APPLICATION NUMBER: 60/238,384
; PRIOR FILING DATE: 2000-10-06 

PRIOR APPLICATION NUMBER: 60/238,373 
; PRIOR FILING DATE: 2000-10-06 
; PRIOR APPLICATION NUMBER: 60/238,372 
; PRIOR FILING DATE: 2000-10-06 
; PRIOR APPLICATION NUMBER: 60/238,383 
; PRIOR FILING DATE: 2000-10-06 
; PRIOR APPLICATION NUMBER: 60/238,382 
; PRIOR FILING DATE: 2000-10-06 
; PRIOR APPLICATION NUMBER: 60/275,892 
; PRIOR FILING DATE: 2001-03-14 



PRIOR APPLICATION NUMBER: 60/296,860 
PRIOR FILING DATE: 2001-06-08 
NUMBER OF SEQ ID NOS : 198 
SOFTWARE: PatentIn Ver. 2.1
SEQ ID NO 121 
LENGTH: 931 
TYPE: PRT 

ORGANISM: Mus musculus
US-09-972-211-121 

Query Match 58.2%; Score 2787; DB 12; Length 931; 

Best Local Similarity 57.3%; Pred. No. 3.6e-231; 

Matches 522; Conservative 153; Mismatches 208; Indels 28; Gaps 9; 

Qy 9 PALLGIVLAAWLRGSGAQQSA---TVANPVPGANPDLLPHFLVEPEDVYIVKNKPVLLVC 65

III : I : I I I I I : I I : I I I I I : I I I : I I I I I I I I I I 

Db 26 PAL--ALLSASGTGSAAQDDEFFHELPETFPSDPPEPLPHFLIEPEEAYIVKNKPVNLYC 83

Qy 66 KAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLEEYWCQ 125 

I I I I I I I : I I II I I I I II I : : I : I I I I I I : I I I I I I : : I I I : I II I 
Db 84 KASPATQIYFKCNSEWVHQKDHVVDERVDETSGLIVREVSIEISRQQVEELFGPEDYWCQ 143

Qy 126 CVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAEVEWLR 185 

I I I I I I : I I I I I : I II : I I I III Mill! I I I I I I I : : I I I I I I I I I I I I I I I : 
Db 144 CVAWSSAGTTKSRKAYVRIAYLRKTFEQEPLGKEVSLEQEVLLQCRPPEGIPVAEVEWLK 203

Qy 186 NEDLVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSASAAVIVYVNGGW 245

I I I : : I I : I I III : I : I : : : I I I I : I I I I I I I I I I I I I I : I : I : I I I I I I I I I I 
Db 204 NEDIIDPAEDRNFYITIDHNLIIKQARLSDTANYTCVAKNIVAKRKSTTATVIVYVNGGW 263

Qy 246 STWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGSWSPWS 305

I I I I I I I I I : : I I I I : I I I : I : I I I I I I I I I I I I I I I I : I I I II I I I I I I I I : II 

Db 264 STWTEWSVCNSRCGRGYQKRTRTCTNPAPLNGGAFCEGQSVQKIACTTLCPVDGRWTSWS 323 

Qy 306 KWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHSASGPEDVALYVGL 365 

II I I I : I I I I I I I I : I I I : I I I :: I I I : : I I I I I : : I : I I I I I I I : 

Db 324 KWSTCGTECTHWRRRECTAPAPKNGGKDCDGLVLQSKNCTDGLCMQAAPDSDDVALYVGI 383 

Qy 366 -IAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLLTIQP 424

I I I I I I : : : I : I I : : I I : I I I I I I I I I : I I : : I I I : I 
Db 384 VIAVTVCLAITVVVALFVYRKNHRDFESDIIDSSALNGGFQPVNIKAARQD---LLAVPP 440

Qy 425 DLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSS----PTSEAEEFVS 480

II:: I : I : I II : I I : I I I : : :: I I I : III 
Db 441 DLTSAAAMYRGPVYALHD-VSDKIPMTNSPILDPLPNLKIKVYNSSGAVTPQDDLAEFSS 499

Qy 481 RLS TQNYF RSLPRGT--SNMTYGTFNFLGGRLMIPNTGISLLIPPDAI 526

: I I II: : I I I I I : I I I I I I I I : I I I : I : I I I I I II 

Db 500 KLSPQMTQSLLENEALNLKNQSLARQTDPSCTAFGTFNSLGGHLIIPNSGVSLLIPAGAI 559 

Qy 527 PRGKIYEIYLTLHKPEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDS 586

I : I : : I I : I : I = I = I - I I : I I I I : I : I I I I I I I I I I I I I I I - II : I I : 
Db 560 PQGRVYEMYVTVHRKENMRPPMEDSQTLLTPVVSCGPPGALLTRPVILTLHHCADPSTED 619

Qy 587 WSLRLKKQSCEGSWEDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAA 646

I : : I I I : : I I M I : : I I | : I I I : I I I :: I I I : I I I I : : : II I 
Db 620 WKIQLKNQAVQGQWEDVVVVGEENFTTPCYIQLDAEACHILTENLSTYALVGQSTTKAAA 679



Qy 647 KRLKLLLFAPVACTSLEYNIRVYCLHDTHDALKEVVQLEKQLGGQLIQEPRVLHFKDSYH 706

Mill :| I: I : I I I I : I I I I I I II I I I I I I : I I I : I : I I I I : : I I : II I I I I 

Db 680 KRLKLAIFGPLCCSSLEYSIRVYCLDDTQDALKEVLQLERQMGGQLLEEPKALHFKGSIH 739



Qy 707 NLRLSIHDVPSSLWKSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWV 766

I II I I I I I : I I I I I I I I I I I I I I I I I I : I : I I I I I I I I I I I : I : I : I III I

Db 740 NLRLSIHDIAHSLWKSKLLAKYQEIPFYHIWSGSQRNLHCTFTLERLSLNTVELVCKLCV 799

Qy 767 WQVEGDGQSFSINFNITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISSLDPP 826

I I I I : I I I : I : : : : : I I : : : : I I I I I I I I I I I : I I I I I

Db 800 RQVEGEGQIFQLNCTVSEEPTGIDLPLLDPASTITTVTGPSAFSIPLPIRQKLCSSLDAP 859

Qy 827 CRRGADWRTLAQKLHLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQP 886

II III I I I I : I I : I : : I I : I I II : I I : I I I I : : I I : I I I I III : : I :

Db 860 QTRGHDWRMLAHKLNLDRYLNYFATKSSPTGVILDLWEAQNFPDGNLSMLAAVLEEMGRH 919

Qy 887 DAGLFTVSEAE 897

: : : | :
Db 920 ETWSLAAEGQ 930



RESULT 15 
US-10-037-417-117 

Sequence 117, Application US/10037417 
Publication No. US20040052806A1 
GENERAL INFORMATION: 
APPLICANT: Kekuda, Ramesh 
APPLICANT: Alsobrook II, John P 
APPLICANT: Tchernev, Velizar T 
APPLICANT: Liu, Xiaohong 
APPLICANT: Spytek, Kimberly A 
APPLICANT: Patturajan, Meera 
APPLICANT: Grosse, William M 
APPLICANT: Lepley, Denise M 
APPLICANT: Burgess, Catherine E 
APPLICANT: Vernet, Corine A.M. 
APPLICANT: Li, Li 
APPLICANT: Gorman, Linda 
APPLICANT: Edinger, Shlomit R 
APPLICANT: Sciore, Paul
APPLICANT: Ellerman, Karen 
APPLICANT: Malyankar, Uriel M 
APPLICANT: Rothenberg, Mark 
APPLICANT: Stone, David J 
APPLICANT: Boldog, Ferenc L 
APPLICANT: Guo, Xiaojia 
APPLICANT: Shenoy, Suresh G 
APPLICANT: Anderson, David W 
APPLICANT: Padigaru, Muralidhara 
APPLICANT: Taupier Jr, Raymond J 
APPLICANT: Miller, Charles E 
APPLICANT: Eisen, Andrew J 

TITLE OF INVENTION: Proteins and Nucleic Acids Encoding Same 
FILE REFERENCE: 21402-235 

CURRENT APPLICATION NUMBER: US/10/037,417
CURRENT FILING DATE: 2002-09-20 



PRIOR APPLICATION NUMBER: 60/260,018 
PRIOR FILING DATE: 2001-01-05 
PRIOR APPLICATION NUMBER: 60/260,360 
PRIOR FILING DATE: 2001-01-08 
PRIOR APPLICATION NUMBER: 60/272,411 
PRIOR FILING DATE: 2001-02-28 
PRIOR APPLICATION NUMBER: 60/272,817 
PRIOR FILING DATE: 2001-03-02 
PRIOR APPLICATION NUMBER: 60/291,186 
PRIOR FILING DATE: 2001-05-15 
PRIOR APPLICATION NUMBER: 60/303,231 
PRIOR FILING DATE: 2001-07-05 
PRIOR APPLICATION NUMBER: 60/305,060 
PRIOR FILING DATE: 2001-07-12 
PRIOR APPLICATION NUMBER: 60/318,405 
PRIOR FILING DATE: 2001-09-10 
PRIOR APPLICATION NUMBER: 60/318,700 
PRIOR FILING DATE: 2001-09-12 
NUMBER OF SEQ ID NOS : 227 
SOFTWARE: PatentIn Ver. 2.1
SEQ ID NO 117 
LENGTH: 931 
TYPE: PRT 

ORGANISM: Caenorhabditis elegans 
US-10-037-417-117 

Query Match 58.2%; Score 2787; DB 12; Length 931; 

Best Local Similarity 57.3%; Pred. No. 3.6e-231; 

Matches 522; Conservative 153; Mismatches 208; Indels 28; Gaps 9 

Qy 9 PALLGIVLAAWLRGSGAQQSA---TVANPVPGANPDLLPHFLVEPEDVYIVKNKPVLLVC 65

III : I : I I I I I : I I : I I I I I : I I I : I I I I I I I I I I 

Db 26 PAL--ALLSASGTGSAAQDDEFFHELPETFPSDPPEPLPHFLIEPEEAYIVKNKPVNLYC 83

Qy 66 KAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLEEYWCQ 125 

I I I I I I I : I I I I I I I I I I I : : I : I I I I I I : I I I I I I : : I I I : I I I I 

Db 84 KASPATQIYFKCNSEWVHQKDHVVDERVDETSGLIVREVSIEISRQQVEELFGPEDYWCQ 143

Qy 126 CVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAEVEWLR 185 

I I I I I I : I I I I I : I I I : I I I III I I I I I I I I I I I I I : : I I I I I I I I I I I I I I I : 

Db 144 CVAWSSAGTTKSRKAYVRIAYLRKTFEQEPLGKEVSLEQEVLLQCRPPEGIPVAEVEWLK 203 

Qy 186 NEDLVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSASAAVIVYVNGGW 245

I I I : : I I : I I III : I : I : : : I I I I : I I I I I II I I I I I I I : I : I : I I I I I I I I I I 

Db 204 NEDIIDPAEDRNFYITIDHNLIIKQARLSDTANYTCVAKNIVAKRKSTTATVIVYVNGGW 263

Qy 246 STWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGSWSPWS 305

I I I I I I I I I : : I I I I : I I I : I : I I I I I I I I I I I I I I I I : I I I II I I I I I I I I : II 
Db 264 STWTEWSVCNSRCGRGYQKRTRTCTNPAPLNGGAFCEGQSVQKIACTTLCPVDGRWTSWS 323

Qy 306 KWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHSASGPEDVALYVGL 365

I I I I I : I I I I I I I I : I I I : I I I :: I I I : : I I I I I : : I : I I I I I II : 
Db 324 KWSTCGTECTHWRRRECTAPAPKNGGKDCDGLVLQSKNCTDGLCMQAAPDSDDVALYVGI 383 

Qy 366 -IAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLLTIQP 424

I I I I I I : : : I : I I : : I I : I I I I I I I I I : I I : : I I I : I 
Db 384 VIAVTVCLAITVVVALFVYRKNHRDFESDIIDSSALNGGFQPVNIKAARQD---LLAVPP 440



Qy 425 DLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSS----PTSEAEEFVS 480

II:: I : I : I I I : I I : I I I : : : : I I I : I I I 

Db 441 DLTSAAAMYRGPVYALHD-VSDKIPMTNSPILDPLPNLKIKVYNSSGAVTPQDDLAEFSS 499

Qy 481 RLS TQNYF RSLPRGT--SNMTYGTFNFLGGRLMIPNTGISLLIPPDAI 526

: I I II: : I I I I I : I I I I I I I I : I I I : I : I I I I I II 

Db 500 KLSPQMTQSLLENEALNLKNQSLARQTDPSCTAFGTFNSLGGHLIIPNSGVSLLIPAGAI 559 

Qy 527 PRGKIYEIYLTLHKPEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDS 586

I : I : : II : I : I : I : I : : I I : I I I I : I : I I I I I I I I I I I I I I I : II : I I : 
Db 560 PQGRVYEMYVTVHRKENMRPPMEDSQTLLTPVVSCGPPGALLTRPVILTLHHCADPSTED 619

Qy 587 WSLRLKKQSCEGSWEDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAA 646

I : : I I I : : I I I I I : : I I I : I I I : I I I : : I I I : I I I I : : : I I I 
Db 620 WKIQLKNQAVQGQWEDVVVVGEENFTTPCYIQLDAEACHILTENLSTYALVGQSTTKAAA 679

Qy 647 KRLKLLLFAPVACTSLEYNIRVYCLHDTHDALKEVVQLEKQLGGQLIQEPRVLHFKDSYH 706

I I I I I : I I : I : I I I I : I I I I I I II I I I I I I : I I I : I : I I I I : : I I : I I I I I I 

Db 680 KRLKLAIFGPLCCSSLEYSIRVYCLDDTQDALKEVLQLERQMGGQLLEEPKALHFKGSIH 739 

Qy 707 NLRLSIHDVPSSLWKSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWV 766 

I I I I I I I I : I I I I I I I I I I I I I I I I I I : I : I I I I I I I I I I I . : I : I : I III I 
Db 740 NLRLSIHDIAHSLWKSKLLAKYQEIPFYHIWSGSQRNLHCTFTLERLSLNTVELVCKLCV 799 

Qy 767 WQVEGDGQSFSINFNITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISSLDPP 826 

I I I I : I I I : I : : : : : I I : : : : I I I I I I I I I I I : I I I I I 

Db 800 RQVEGEGQIFQLNCTVSEEPTGIDLPLLDPASTITTVTGPSAFSIPLPIRQKLCSSLDAP 859 

Qy 827 CRRGADWRTLAQKLHLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQP 886 

II III II I I : I I : I : : I I : I III : I I : I I I I : : I I : I I I I III : : I : 
Db 860 QTRGHDWRMLAHKLNLDRYLNYFATKSSPTGVILDLWEAQNFPDGNLSMLAAVLEEMGRH 919

Qy 887 DAGLFTVSEAE 897 

: • : I : 
Db 920 ETWSLAAEGQ 930 



Search completed: July 12, 2004, 23:08:12 
Job time : 100 secs



GenCore version 5.1.6 
Copyright (c) 1993 - 2004 Compugen Ltd. 



OM protein - protein search, using sw model

Run on: July 12, 2004, 22:26:45
Search time 92 Seconds (without alignments)
3079.732 Million cell updates/sec

Title: US-10-624-932-2
Perfect score: 4791
Sequence: 1 MAVRPGLWPALLGIVLAAWL...AVAGLGQPDAGLFTVSEAEC 898

Scoring table: BLOSUM62
Gapop 10.0 , Gapext 0.5

Searched: 1017041 seqs, 315518202 residues

Total number of hits satisfying chosen parameters: 1017041

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries
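The throughput figure in the run header above appears consistent with counting one dynamic-programming cell per (query residue, database residue) pair. The following is a minimal sanity-check sketch under that assumption; the function name is hypothetical and is not part of GenCore.

```python
def million_cell_updates_per_sec(query_len: int, db_residues: int, seconds: float) -> float:
    """Assumed definition: (query length x database residues) / search time, in millions."""
    cells = query_len * db_residues  # one cell per residue pair (assumption)
    return cells / seconds / 1e6


# Values taken from the run header: 898-residue query, 315518202 residues, 92 seconds.
rate = million_cell_updates_per_sec(898, 315518202, 92)
print(f"{rate:.3f} Million cell updates/sec")
```

Under this assumption the computed rate agrees with the reported 3079.732 to three decimals.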



Database : SPTREMBL_25:*
1: sp_archea:*
2: sp_bacteria:*
3: sp_fungi:*
4: sp_human:*
5: sp_invertebrate:*
6: sp_mammal:*
7: sp_mhc:*
8: sp_organelle:*
9: sp_phage:*
10: sp_plant:*
11: sp_rodent:*
12: sp_virus:*
13: sp_vertebrate:*
14: sp_unclassified:*
15: sp_rvirus:*
16: sp_bacteriap:*
17: sp_archeap:*



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 

SUMMARIES 



Result            Query
   No.    Score   Match  Length  DB  ID      Description

     1   4685     97.8     898   11  Q8K1S4  Q8k1s4 mus musculu
     2   4638     96.8     898   11  O08721  O08721 rattus norv
     3   2845     59.4     544    4  Q96GP4  Q96gp4 homo sapien
     4   2787     58.2     931   11  O08747  O08747 mus musculu
     5   2767.5   57.8     950   11  Q8CD16  Q8cd16 mus musculu
     6   2761     57.6     931   13  Q7T2Z5  Q7t2z5 gallus gall
     7   2755     57.5     931    4  O95185  O95185 homo sapien
     8   2646.5   55.2     943   13  Q8JGT4  Q8jgt4 xenopus lae
     9   2585     54.0    1008   11  Q80Y85  Q80y85 mus musculu
    10   2578.5   53.8     945   11  Q8K1S3  Q8k1s3 mus musculu
    11   2578.5   53.8     945   11  O08722  O08722 rattus norv
    12   2572.5   53.7     945   11  Q9D398  Q9d398 mus musculu
    13   2566     53.6     934    4  Q8IZJ1  Q8izj1 homo sapien
    14   2558.5   53.4     945    4  Q86SN3  Q86sn3 homo sapien
    15   2200     45.9     956   11  Q8K1S2  Q8k1s2 mus musculu
    16   2189.5   45.7     948    4  Q8WYP7  Q8wyp7 homo sapien
    17   1668.5   34.8     597    4  Q8IUT0  Q8iut0 homo sapien
    18   1458     30.4     328   11  Q80T71  Q80t71 mus musculu
    19   1242.5   25.9     554    4  Q8N1Y2  Q8n1y2 homo sapien
    20    997     20.8    1072    5  Q9NBL0  Q9nbl0 drosophila
    21    992     20.7    1072    5  Q9V7B5  Q9v7b5 drosophila
    22    981.5   20.5     366    4  Q9H9F3  Q9h9f3 homo sapien
    23    980     20.5     947    5  Q26262  Q26262 caenorhabdi
    24    977     20.4     947    5  O44171  O44171 caenorhabdi
    25    692     14.4     199   13  Q9PVD5  Q9pvd5 petromyzon
    26    552.5   11.5     351    4  Q8TF26  Q8tf26 homo sapien
    27    377.5    7.9    2673    4  Q96SC3  Q96sc3 homo sapien
    28    377.5    7.9    5636    4  Q96RW7  Q96rw7 homo sapien
    29    318      6.6     325    5  Q8I1K1  Q8i1k1 drosophila
    30    300      6.3     518    4  Q8IV45  Q8iv45 homo sapien
    31    293      6.1    1172   11  Q8CG21  Q8cg21 mus musculu
    32    293      6.1    1172   11  Q7TMT3  Q7tmt3 mus musculu
    33    292      6.1    1582   11  Q8CGM0  Q8cgm0 mus musculu
    34    286      6.0    1081    5  Q9U631  Q9u631 drosophila
    35    285      5.9    1083    5  Q9VTT0  Q9vtt0 drosophila
    36    285      5.9    1091    5  Q7YU67  Q7yu67 drosophila
    37    276      5.8    1461    5  Q8MYA8  Q8mya8 caenorhabdi
    38    275.5    5.8    1122   11  Q7TT33  Q7tt33 mus musculu
    39    275      5.7    1522   11  Q80ZF8  Q80zf8 mus musculu
    40    274.5    5.7    1573    4  Q8NGW8  Q8ngw8 homo sapien
    41    273.5    5.7     478   11  Q8BVE5  Q8bve5 mus musculu
    42    271.5    5.7     685    6  Q9TTS5  Q9tts5 bos taurus
    43    271.5    5.7    5146    6  Q8SPM4  Q8spm4 bos taurus
    44    271      5.7    1560   11  Q8CGM1  Q8cgm1 mus musculu
    45    270.5    5.6    1171   11  Q8CGB2  Q8cgb2 mus musculu
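The Query Match column in the summaries appears to be the hit score expressed as a percentage of the perfect score (4791 for this query), rounded to one decimal place. The sketch below assumes that relationship; the helper name is illustrative and the report itself does not state the formula.

```python
PERFECT_SCORE = 4791  # "Perfect score" from the run header


def query_match_percent(score: float, perfect_score: float = PERFECT_SCORE) -> float:
    """Assumed definition: 100 * score / perfect score, rounded to one decimal."""
    return round(100.0 * score / perfect_score, 1)


# A few (score, reported Query Match) pairs copied from the summaries.
for score, reported in [(4685, 97.8), (2787, 58.2), (270.5, 5.6)]:
    print(score, query_match_percent(score), reported)
```

A value that disagrees with this ratio is a hint that the score or the percentage was damaged during text extraction.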



ALIGNMENTS 



RESULT 1 
Q8K1S4 

ID Q8K1S4 PRELIMINARY; PRT; 898 AA. 

AC Q8K1S4; 

DT 01-OCT-2002 (TrEMBLrel. 22, Created) 

DT 01-OCT-2002 (TrEMBLrel. 22, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 



DE Netrin receptor Unc5h1.
GN UNC5H1.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A.
RA Engelkamp D.;
RT "Cloning of three mouse unc-5 genes and their expression patterns at
RT mid-gestation.";
RL Submitted (MAY-2002) to the EMBL/GenBank/DDBJ databases.
DR EMBL; AJ487852; CAD32250.1;
DR MGD; MGI:894682; Unc5h1.
DR GO; GO:0004872; F:receptor activity; IEA.
DR GO; GO:0007165; P:signal transduction; IEA.
DR InterPro; IPR000488; Death.
DR InterPro; IPR003599; Ig.
DR InterPro; IPR007110; Ig-like.
DR InterPro; IPR000884; TSP1.
DR InterPro; IPR008085; TSP_1.
DR InterPro; IPR000906; ZU5.
DR Pfam; PF00531; death; 1.
DR Pfam; PF00047; ig; 1.
DR Pfam; PF00090; tsp_1; 2.
DR Pfam; PF00791; ZU5; 1.
DR PRINTS; PR01705; TSP1REPEAT.
DR SMART; SM00005; DEATH; 1.
DR SMART; SM00409; IG; 1.
DR SMART; SM00209; TSP1; 2.
DR SMART; SM00218; ZU5; 1.
DR PROSITE; PS50835; IG_LIKE; 1.
DR PROSITE; PS50092; TSP1; 2.

KW Receptor. 

SQ SEQUENCE 898 AA; 98856 MW; 59F04BA2E196C1DB CRC64; 

Query Match 97.8%; Score 4685; DB 11; Length 898; 

Best Local Similarity 96.7%; Pred. No. 0; 

Matches 868; Conservative 19; Mismatches 11; Indels 0; Gaps 0 
Qy 1 MAVRPGLWPALLGIVLAAWLRGSGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKP 60

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 1 MAVRPGLWPALLGIVLTAWLRGSGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKP 60

Qy 61 VLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLE 120

Db 61 VLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLE 120

Qy 121 EYWCQCVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAE 180

Db 121 EYWCQCVAWSSSGTTKSQKAYIRIAYLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAE 180

Qy 181 VEWLRNEDLVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSASAAVIVY 240

Db 181 VEWLRNEDLVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSASAAVIVY 240

Qy 241 VNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGS 300

1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1

Db 241 VNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGS 300 

Qy 301 WSPWSKWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHSASGPEDVA 360 

I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I : I I II I I I I I I I I I : I : : I I I I I I I 
Db 301 WSPWSKWSACGLDCTHWRSRECSDPAPRNGGEECRGADLDTRNCTSDLCLHTSSGPEDVA 360 

Qy 361 LYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLL 420 

I I : I I : I I I I I I : I I I I I I : I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 361 LYIGLVAVAVCLILLLLVLVLIYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLL 420 

Qy 421 TIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSSPTSEAEEFVS 480 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I II I I I I : I I I 
Db 421 TIQPDLSTTTTTYQGSLCPRQDGPSPKFQLSNGHLLSPLGSGRHTLHHSSPTSEAEDFVS 480 

Qy 481 RLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHK 540 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I 
Db 481 RLSTQNYFRSLPRGTSNMAYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHK 540

Qy 541 PEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSW 600

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 541 PEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSW 600 

Qy 601 EDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVACT 660

I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 111:1111111111 
Db 601 EDVLHLGEESPSHLYYCQLEAGACYVFTEQLGRFALVGEALSVAATKRLRLLLFAPVACT 660 

Qy 661 SLEYNIRVYCLHDTHDALKEVVQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLW 720

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 661 SLEYNIRVYCLHDTHDALKEVVQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLW 720

Qy 721 KSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFSINF 780 

I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I : I I I I I I I I : I I I I I I I I I I I I : I I I 

Db 721 KSKLLVSYQEIPFYHIWNGTQQYLHCTFTLERWASTSDIACKVWVWQVEGDGQSFNINF 780 

Qy 781 NITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLAQKL 840 

I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I 
Db 781 NITKDTRFAEMLALESEGGVPALVGPSAFKIPFLIRQKIITSLDPPCSRGADWRTLAQKL 840 

Qy 841 HLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAEC 898 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 841 HLDSHLSFFASKPSPTAMILNLWEARHFPNGNLGQLAAAVAGLGQPDAGLFTVSEAEC 898 
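The Best Local Similarity value printed with each alignment appears to equal Matches divided by the total number of alignment columns (Matches + Conservative + Mismatches + Indels). This is an inference from the figures in this report, not a documented definition; the sketch below checks it against three results, and the function name is hypothetical.

```python
def best_local_similarity(matches: int, conservative: int, mismatches: int, indels: int) -> float:
    """Assumed definition: identical columns as a percentage of all alignment columns."""
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


# (Matches, Conservative, Mismatches, Indels) copied from the statistics lines.
print(best_local_similarity(868, 19, 11, 0))     # Q8K1S4, reported 96.7%
print(best_local_similarity(862, 17, 19, 0))     # second TrEMBL hit, reported 96.0%
print(best_local_similarity(522, 153, 208, 28))  # 931-residue hits, reported 57.3%
```

The same column total is a useful cross-check when repairing alignment rows: each Qy/Db row pair should span the same number of columns, counting gap characters.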



RESULT 2 
O08721

ID O08721 PRELIMINARY; PRT; 898 AA.
AC O08721;
DT 01-JUL-1997 (TrEMBLrel. 04, Created)
DT 01-JUL-1997 (TrEMBLrel. 04, Last sequence update)
DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update)
DE Transmembrane receptor UNC5H1.
OS Rattus norvegicus (Rat).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.
OX NCBI_TaxID=10116;
RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=Brain, and Ventral spinal cord;
RX MEDLINE=97271897; PubMed=9126742;
RA Leonardo E.D., Hinck L., Masu M., Keino-Masu K., Ackerman S.L.,
RA Tessier-Lavigne M.;
RT "Vertebrate homologues of C. elegans UNC-5 are candidate netrin
RT receptors.";
RL Nature 386:833-838(1997).
DR EMBL; U87305; AAB57678.1; -.
DR GO; GO:0004872; F:receptor activity; IEA.
DR GO; GO:0007165; P:signal transduction; IEA.
DR InterPro; IPR000488; Death.
DR InterPro; IPR003599; Ig.
DR InterPro; IPR007110; Ig-like.
DR InterPro; IPR000884; TSP1.
DR InterPro; IPR008085; TSP_1.
DR InterPro; IPR000906; ZU5.
DR Pfam; PF00531; death; 1.
DR Pfam; PF00047; ig; 1.
DR Pfam; PF00090; tsp_1; 2.

DR Pfam; PF00791; ZU5; 1. 

DR PRINTS; PR01705; TSP1REPEAT. 

DR SMART; SM00005; DEATH; 1. 

DR SMART; SM00409; IG; 1. 

DR SMART; SM00209; TSP1; 2. 

DR SMART; SM00218; ZU5; 1. 

DR PROSITE; PS50835; IG_LIKE; 1. 

DR PROSITE; PS50092; TSP1; 2. 

KW Receptor. 

SQ SEQUENCE 898 AA; 98840 MW; 7A3CBCB9E7ACA135 CRC64; 

Query Match 96.8%; Score 4638; DB 11; Length 898; 

Best Local Similarity 96.0%; Pred. No. 0; 

Matches 862; Conservative 17; Mismatches 19; Indels 0; Gaps 0 

Qy 1 MAVRPGLWPALLGIVLAAWLRGSGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKP 60

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 1 MAWPGLWPVTjLGIVLAAWLRGSGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKP 60 

Qy 61 VLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLE 120
      |||||||||||||||||||||||||||||||||| |||||||||||||||||||||||||
Db 61 VLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDSSSGLPTMEVRINVSRQQVEKVFGLE 120

Qy 121 EYWCQCVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAE 180
       ||||||||||||||||||||||||| ||||||||||||||||||||||||||||||||||
Db 121 EYWCQCVAWSSSGTTKSQKAYIRIAYLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAE 180

Qy 181 VEWLRNEDLVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSASAAVIVY 240
       |||||||||||||||||||||||||||||||||||||||||||||||||||| |||||||
Db 181 VEWLRNEDLVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSTSAAVIVY 240

Qy 241 VNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGS 300
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 241 VNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGS 300

Qy 301 WSPWSKWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHSASGPEDVA 360
       || |||||||||||||||||||||||||||||||:| ||||||||||||:|:|| |||||
Db 301 WSSWSKWSACGLDCTHWRSRECSDPAPRNGGEECRGADLDTRNCTSDLCLHTASCPEDVA 360

Qy 361 LYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLL 420
       ||:||:|||||| |||| | |:||||||||||||||||||||||||||||||||||||||
Db 361 LYIGLVAVAVCLFLLLLALGLIYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLL 420

Qy 421 TIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSSPTSEAEEFVS 480
       |||||||||||||||||| |||||||||||:||||||||| |||||||||||||||:|||
Db 421 TIQPDLSTTTTTYQGSLCSRQDGPSPKFQLSNGHLLSPLGSGRHTLHHSSPTSEAEDFVS 480

Qy 481 RLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHK 540
       |||||||||||||||||| |||||||||||||||||||||||||||||||||||||||||
Db 481 RLSTQNYFRSLPRGTSNMAYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHK 540

Qy 541 PEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSW 600
       |||||||||||||||||:||||||||||||||||||||||||||||||||||||||||||
Db 541 PEDVRLPLAGCQTLLSPVVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSW 600

Qy 601 EDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVACT 660
       |||||||||:||||||||||| ||||||||||||||||||||||| |||:||||||||||
Db 601 EDVLHLGEESPSHLYYCQLEAGACYVFTEQLGRFALVGEALSVAATKRLRLLLFAPVACT 660

Qy 661 SLEYNIRVYCLHDTHDALKEVVQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLW 720
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 661 SLEYNIRVYCLHDTHDALKEVVQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLW 720

Qy 721 KSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFSINF 780
       |||||||||||||||||||||:||||||||||:: ||||| ||:||||||||||||:|||
Db 721 KSKLLVSYQEIPFYHIWNGTQQYLHCTFTLERINASTSDLVCKVWVWQVEGDGQSFNINF 780

Qy 781 NITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLAQKL 840
       ||||||||||||||||| ||||||||||||||||||||||:|||||| ||||||||||||
Db 781 NITKDTRFAELLALESEGGVPALVGPSAFKIPFLIRQKIIASLDPPCSRGADWRTLAQKL 840

Qy 841 HLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAEC 898
       ||||||||||||||||||||||||||||||||| ||||||||||||||||||||||||
Db 841 HLDSHLSFFASKPSPTAMILNLWEARHFPNGNLGQLAAAVAGLGQPDAGLFTVSEAEC 898



RESULT 3
Q96GP4
ID Q96GP4 PRELIMINARY; PRT; 544 AA.
AC Q96GP4;
DT 01-DEC-2001 (TrEMBLrel. 19, Created)
DT 01-DEC-2001 (TrEMBLrel. 19, Last sequence update)
DT 01-JUN-2003 (TrEMBLrel. 24, Last annotation update)
DE Similar to transmembrane receptor Unc5H1 (Fragment).
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX NCBI_TaxID=9606;
RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=Brain;
RA Strausberg R.;
RL Submitted (JUN-2001) to the EMBL/GenBank/DDBJ databases.
DR EMBL; BC009333; AAH09333.1; -.
DR GO; GO:0016021; C:integral to membrane; IEA.
DR GO; GO:0004872; F:receptor activity; IEA.
DR GO; GO:0007165; P:signal transduction; IEA.
DR InterPro; IPR000488; Death.
DR InterPro; IPR000906; ZU5.
DR Pfam; PF00531; death; 1.
DR Pfam; PF00791; ZU5; 1.
DR SMART; SM00005; DEATH; 1.
DR SMART; SM00218; ZU5; 1.
KW Receptor; Transmembrane.
FT NON_TER 1 1
SQ SEQUENCE 544 AA; 59949 MW; 350A7BA53375CCAE CRC64;

Query Match 59.4%; Score 2845; DB 4; Length 544;
Best Local Similarity 100.0%; Pred. No. 1.9e-253;
Matches 541; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy 358 DVALYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNP 417
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   4 DVALYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNP 63

Qy 418 HLLTIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSSPTSEAEE 477
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  64 HLLTIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSSPTSEAEE 123

Qy 478 FVSRLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLT 537
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 124 FVSRLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLT 183

Qy 538 LHKPEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCE 597
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 184 LHKPEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCE 243

Qy 598 GSWEDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPV 657
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 244 GSWEDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPV 303

Qy 658 ACTSLEYNIRVYCLHDTHDALKEVVQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPS 717
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 304 ACTSLEYNIRVYCLHDTHDALKEVVQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPS 363

Qy 718 SLWKSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFS 777
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 364 SLWKSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFS 423

Qy 778 INFNITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLA 837
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 424 INFNITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLA 483

Qy 838 QKLHLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAE 897
       ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 484 QKLHLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAE 543

Qy 898 C 898
       |
Db 544 C 544



RESULT 4
O08747
ID O08747 PRELIMINARY; PRT; 931 AA.
AC O08747;
DT 01-JUL-1997 (TrEMBLrel. 04, Created)
DT 01-JUL-1997 (TrEMBLrel. 04, Last sequence update)
DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update)
DE Rostral cerebellar malformation protein.
GN UNC5H3 OR RCM.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=C57B6/SJL;
RX MEDLINE=97271898; PubMed=9126743;
RA Ackerman S.L., Kozak L.P., Przyborski S.A., Rund L.A., Boyer B.B.,
RA Knowles B.B.;
RT "The mouse rostral cerebellar malformation gene encodes an UNC-5-like
RT protein.";
RL Nature 386:838-842(1997).
DR EMBL; U72634; AAB54103.1; -.
DR MGD; MGI:1095412; Unc5h3.
DR GO; GO:0005886; C:plasma membrane; IC.
DR GO; GO:0005042; F:netrin receptor activity; IDA.
DR GO; GO:0005515; F:protein binding; IDA.
DR GO; GO:0007420; P:brain development; IMP.
DR GO; GO:0030334; P:regulation of cell migration; IMP.
DR InterPro; IPR000488; Death.
DR InterPro; IPR007110; Ig-like.
DR InterPro; IPR003598; Ig_c2.
DR InterPro; IPR000884; TSP1.
DR InterPro; IPR008085; TSP_1.
DR InterPro; IPR000906; ZU5.
DR Pfam; PF00531; death; 1.
DR Pfam; PF00047; ig; 1.
DR Pfam; PF00090; tsp_1; 2.
DR Pfam; PF00791; ZU5; 1.
DR PRINTS; PR01705; TSP1REPEAT.
DR SMART; SM00005; DEATH; 1.
DR SMART; SM00408; IGc2; 1.
DR SMART; SM00209; TSP1; 2.
DR SMART; SM00218; ZU5; 1.
DR PROSITE; PS50835; IG_LIKE; 1.
DR PROSITE; PS50092; TSP1; 2.
KW Immunoglobulin domain.
SQ SEQUENCE 931 AA; 103062 MW; 8A5D951A4EECA179 CRC64;

Query Match 58.2%; Score 2787; DB 11; Length 931;
Best Local Similarity 57.3%; Pred. No. 9.7e-248;
Matches 522; Conservative 153; Mismatches 208; Indels 28; Gaps 9;

Qy 9 PALLGIVLAAWLRGSGAQQSA TVANPVPGANPDLLPHFLVEPEDVYIVKNKPVLLVC 65 

IN • I : I I I II : I I : I I I I I : I I I : I I I I I I I I I | 

Db 26 PAL — ALLSASGTGSAAQDDEFFHELPETFPSDPPEPLPHFLIEPEEAYIVKNKPVNLYC 83 



Qy 66 KAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLEEYWCQ 125 

II IIIIIHIII III I III:: I : I I I I I I : I I I I II : : I I I : I I I I 
Db 84 KASPATQIYFKCNSEWVHQKDHWDERVDETSGLIVREVSIEISRQQVEELFGPEDYWCQ 143 

Qy 126 CVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAEVEWLR 185 

I I I I I I : I I I I I : I I I : I I I III I I I I I I I I I I I I I : : I I I I I I I I I I I I I I I : 
Db 144 CVAWS S AGTTKS RKAYVRI AYLRKT FEQEP LGKEVS LEQEVLLQCRP PEGI PVAEVEWLK 203 

Qy 186 NEDLVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSASAAVIVYVNGGW 245

I I I : : I I : I I III : I : I : : : I I I I : I I I I I I I I I I I I I I : I : I : I MINIMI 
Db 204 NEDIIDPAEDRNFYITIDHNLIIKQARLSDTANYTCVAKNIVAKRKSTTATVIVYVNGGW 263 

Qy 246 STWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGSWSPWS 305

I I I I II I I I : : I II I : I II : I : I I II I II II I II II M : I I I II I I I II I I I : II 

Db 264 STWTEWSVCNSRCGRGYQKRTRTCTNPAPLNGGAFCEGQSVQKIACTTLCPVDGRWTSWS 323 

Qy 306 KWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHSASGPEDVALYVGL 365 

II I II : II I I I III: I I I : I II : : I I I : : I II I I : : I : I M I M I : 

Db 324 KWSTCGTECTHWRRRECTAPAPKNGGKDCDGLVLQSKNCTDGLCMQAAPDSDDVALYVGI 383 

Qy 366 -IAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLLTIQP 424 

M I II I : : : I : II : : I I : M I I I I I I I : I I : : I II : I 
Db 384 VIAVTVCLAITVWALFVYRKNHRDFESDIIDSSALNGGFQPVNIKAARQD LLAVPP 44 0 

Qy 425 DLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSS PTSEAEEFVS 480 

II:: I : I : I II Ml MM : : : M I I : III 
Db 441 DLTSAAAMYRGPWALHD-VSDKIPMTNSPILDPLPNLKIKVYNSSGAVTPQDDIlAEFSS 499 

Qy 481 RLS TQNYF RSLPRGT — SNMTYGTFNFLGGRLMI PNTGI SLLI PPDAI 526 

Ml II: Mill I M I I I I II I M I I M M I I I I II 

Db 500 KLSPQMTQSLLENEALNLKNQSLARQTDPSCTAFGTFNSLGGHLIIPNSGVSLLIPAGAI 559 

Qy 527 PRGKIYEIYLTLHKPEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDS 586 

I M : : II : I M M : MM I : I II I M M I I I I I I I I I I I I I I : II Ml : 
Db 560 PQGRVYEMYVTVHRKENMRPPMEDSQTLLTPWSCGPPGALLTRPVILTLHHCADPSTED 619 

Qy 587 WSLRLKKQSCEGSWEDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAA 64 6 

I : M I I : M M I I : M I I : I M M I I : : I I I : I I I I : : : I I I 
Db 620 WKIQLKNQAVQGQWEDWWGEENFTTPCYIQLDAEACHILTENLSTYALVGQSTTKAAA 67 9 

Qy 647 KRLKLLLFAPVACTSLEYNIRVYCLHDTHDALKEWQLEKQLGGQLIQEPRVLHFKDSYH 706 

I I I I I M I : I M I I I M I I I II II I I I I I I M II : I M I I I : M I : II I I I I 
Db 680 KRLKLAIFGPLCCSSLEYSIRVYCLDDTQDALKEVLQLERQMGGQLLEEPK7VLHFKGSIH 739 

Qy 707 NLRLSIHDVPSSLWKSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWV 766 

I I I II II I : II I I I I I I I I I I I II I I I M M I I I I I I II I I M M M III I 
Db 740 NLRLSIHDITUiSLWKSKLLAKYQEIPFYHIWSGSQRNLHCTFTLERLSLNTVELVCKLCV 799 

Qy 767 WQVEGDGQSFSINFNITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISSLDPP 826 

1111 = 11 I M :::: M |: : : : Mill II MM: MM I 
Db 800 RQVEGEGQI FQLNCTVSEEPTGI DLPLLDPASTITTVTGPSAFSI PLPI RQKLCSSLDAP 859 

Qy 827 CRRGADWRTLAQKLHLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQP 886 

II I I I M M : I I : I : M I M III : M M I I I :: I I : M II III : : | : 
Db 860 QTRGHDWRMLAHKLNLDRYLNYFATKSSPTGVILDLWE1AQNFPDGNLSMLAAVLEEMGRH 919 



Qy 887 DAGLFTVSEAE 897
       :  :   :| :
Db 920 ETVVSLAAEGQ 930



RESULT 5
Q8CD16
ID Q8CD16 PRELIMINARY; PRT; 950 AA.
AC Q8CD16;
DT 01-MAR-2003 (TrEMBLrel. 23, Created)
DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update)
DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update)
DE Unc5 homolog.
GN UNC5H3.
OS Mus musculus (Mouse).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=C57BL/6J; TISSUE=Testis;
RX MEDLINE=22354683; PubMed=12466851;
RA The FANTOM Consortium,
RA the RIKEN Genome Exploration Research Group Phase I & II Team;
RT "Analysis of the mouse transcriptome based on functional annotation
RT of 60,770 full-length cDNAs.";
RL Nature 420:563-573(2002).
DR EMBL; AK031655; BAC27495.1; -.
DR PIR; PT0566; PT0566.
DR MGD; MGI:1095412; Unc5h3.
DR GO; GO:0005886; C:plasma membrane; IC.
DR GO; GO:0005042; F:netrin receptor activity; IDA.
DR GO; GO:0005515; F:protein binding; IDA.
DR GO; GO:0007420; P:brain development; IMP.
DR GO; GO:0030334; P:regulation of cell migration; IMP.
DR InterPro; IPR000488; Death.
DR InterPro; IPR003599; Ig.
DR InterPro; IPR007110; Ig-like.
DR InterPro; IPR003598; Ig_c2.
DR InterPro; IPR000884; TSP1.
DR InterPro; IPR008085; TSP_1.
DR InterPro; IPR000906; ZU5.
DR Pfam; PF00531; death; 1.
DR Pfam; PF00047; ig; 1.
DR Pfam; PF00090; tsp_1; 2.
DR Pfam; PF00791; ZU5; 1.
DR PRINTS; PR01705; TSP1REPEAT.
DR SMART; SM00005; DEATH; 1.
DR SMART; SM00409; IG; 1.
DR SMART; SM00408; IGc2; 1.
DR SMART; SM00209; TSP1; 2.
DR SMART; SM00218; ZU5; 1.
DR PROSITE; PS50835; IG_LIKE; 1.
DR PROSITE; PS50092; TSP1; 2.
SQ SEQUENCE 950 AA; 105398 MW; 1E8FC74703351AF6 CRC64;

Query Match 57.8%; Score 2767.5; DB 11; Length 950;
Best Local Similarity 56.1%; Pred. No. 6.3e-246;
Matches 522; Conservative 153; Mismatches 208; Indels 47; Gaps 10;



Qy 9 PALLGIVLAAWLRGSGAQQSA TVANPVPGANPDLLPHFLVEPEDVYIVKNKPVLLVC 65 

III : I : I I I II : I I : I I I I I : I I I : I I I I I I I I I I 

Db 26 PAL — ALLSASGTGSAAQDDEFFHELPETFPSDPPEPLPHFLIEPEEAYIVKNKPWLYC 83 

Qy 66 KAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLEEYWCQ 125 

II 11111:1111 III I III:: I :||| II I :||||||::|| hill! 
Db 84 KASPATQIYFKCNSEWVHQKDHWDERVDETSGLIVREVSIEISRQQVEELFGPEDYWCQ 143 

Qy 126 CVAWS S SGTTKSQKAYI RIARLRKNFEQEPLAKEVSLEQGI VLPCRP PEGI PPAEVEWLR 185 

I I I I I I : I I I I I : I I I : I I I III I I I I I I I I I I I I I : : I I I I I I I I I I I I I I I : 
Db 144 CVAWSSAGTTKSRKAYVRIAYLRKTFEQEPLGKEVSLEQEVLLQCRPPEGIPVAEVEWLK 203 

Qy 186 NEDLVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSASAAVIVYVNGGW 245

I I I : : I I : I I III : I : I : : : I I I I : I I I I I I I I I I I I I I : I : I : I I I I I I I I I I 
Db 204 NEDIIDPAEDRNFYITIDHNLIIKQ7VRLSDTANYTCVAKNIVAKRKSTTATVI VTVNGGW 263 

Qy 246 STWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGSWSPWS 305 

I I I I I I I I I : : I I I I : I I I : I : I I I I I I II I I I I I I II : I I I II I I I I I I I I : II 
Db 264 STWTEWSVCNSRCGRGYQKRTRTCTNPAPLNGGAFCEGQSVQKIACTTLCPVDGRWTSWS 323 

Qy 306 KWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVH 351 

I I I I I : I I I I I I I I : I I I : I I I : : I I I : : I I I I I : 
Db 324 KWSTCGTECTHWRRRECTAPAPKNGGKDCDGLVLQSKNCTDGLCMQGFIYPISTEHRPQN 383 

Qy 352 SASGPEDVALYVGL-IAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTSGFQ 405 

II : I I I I I I I : I I I I I I : : : I : I I : : I I : I I I I III 

Db 384 EYGFSSAPDSDDVALYVGIVIAVTVCLAI TVWALFVYRKNHRDFESDIIDSSALNGGFQ 443 

Qy 406 PVSIKPSKADNPHLLTIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHT 465 

I I : I I : : I II : III:: hi : I II :|| :| II : 
Db 444 PVNIKAARQD LLAVPPDLTSAAAMYRGPVYALHD-VSDKI PMTNSPILDPLPNLKIK 499 

Qy 4 66 LHHSS PTSEAEEFVSRLS TQNYF RSLPRGT — SNMTYGTFNFLG 507 

: : : I I I : I I I : I I I I : : I I I I I : I I I I I I 

Db 500 WNSSGAVTPQDDLAEFSSKLSPQMTQSLLENEALNLKNQSLARQTDPSCTAFGTFNSLG 559 

Qy 508 GRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHKPEDVRLPLAGCQTLLSPIVSCGPPGVL 567 

I I : I I I : I : I I I I I I I I : I : : I I : I : I : I : |::| |: I I I I : I : I I I I II I I 
Db 560 GHLIIPNSGVSLLIPAGAIPQGRVYEMYVTVHRKENMRPPMEDSQTLLTPWSCGPPGAL 619 

Qy 568 LTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSWEDVLHLGEEAPSHLYYCQLEASACYVF 627 

I I I I I I I : M : I I : I : : I I I : : I I I I I : : I I I : I I I : I | | : : 

Db 62 0 LTRPVILTLHHCADPSTEDWKIQLKNQAVQGQWEDWWGEENFTTPCYIQLD7VEACHIL 679 

Qy 628 TEQLGRFALVGEALSVAAAKRLKLLLFAPVACTSLEYNIRVYCLHDTHDALKEVVQLEKQ 687

II I : I I I I : : : II I I II II : I I : I : I I I I : I I I I I I II I I I I I I : I I I : I 

Db 680 TENLSTYALVGQSTTKAAAKRLKLAI FGPLCCSSLEYSIRVYCLDDTQDALKEVLQLERQ 739 

Qy 68 8 LGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLWKSKLLVSYQEIPFYHIWNGTQRYLHCT 747 

: I I I I : : I I : I II I MINIMI: I I I I I I II I I I I I I I I I I : I : I I I I I I 
Db 740 MGGQLLEEPKALRFKGSIHNLRLSIHDIAHSLWKSKLLAKYQEIPFYHIWSGSQRNLHCT 799 

Qy 748 FTLERVSPSTSDLACKLWVWQVEGDGQSFSINFNITKDTRFAELLALESEAGVPALVGPS 807

I I I I I : I : I : I I I I I I I I I : I I I : I : : :: : I I : : : : I I I 



Db 800 FTLERLSLNTVELVCKLCVRQVEGEGQIFQLNCTVSEEPTGIDLPLLDPASTITTVTGPS 859 

Qy 808 AFKIPFLIRQKIISSLDPPCRRGADWRTLAQKLHLDSHLSFFASKPSPTAMILNLWEARH 867 

II II MM: MM I II III II | | : | | : | : : | | : | III : I I : I I I | : : 
Db 860 AFS I PLPI RQKLCS SLDAPQTRGHDWRMLAHKLNLDRYLNYFATKS S PTGVI LDLWEAQN 919 

Qy 868 FPNGNLSQLAAAVAGLGQPDAGLFTVSEAE 897 

11:1111 III : :|: : :: :| : 
Db 92 0 FPDGNLSMLAAVLEEMGRHETWYLAAEGQ 949 



RESULT 6
Q7T2Z5
ID Q7T2Z5 PRELIMINARY; PRT; 931 AA.
AC Q7T2Z5;
DT 01-OCT-2003 (TrEMBLrel. 25, Created)
DT 01-OCT-2003 (TrEMBLrel. 25, Last sequence update)
DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update)
DE UNC5-like protein 3.
OS Gallus gallus (Chicken).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Archosauria; Aves; Neognathae; Galliformes; Phasianidae; Phasianinae;
OC Gallus.
OX NCBI_TaxID=9031;
RN [1]
RP SEQUENCE FROM N.A.
RA Guan W., Condic M.L.;
RT "Characterization of Netrin-1, Neogenin and cUNC-5H3 expression during
RT chick dorsal root ganglia development.";
RL Gene Expr. Patterns 3:369-373(2003).
DR EMBL; AY187310; AAO67275.1; -.
SQ SEQUENCE 931 AA; 102906 MW; 1E23A0D84F2E2C62 CRC64;

Query Match 57.6%; Score 2761; DB 13; Length 931;
Best Local Similarity 57.0%; Pred. No. 2.4e-245;
Matches 518; Conservative 151; Mismatches 212; Indels 28; Gaps 9;



Qy 9 PALLGIVLAAWLRGSGAQQS ATVANPVPGANPDLLPHFLVEPEDVYIVKNKPVLLVC 65 

II I M I M I I : I I : M I I I : I I I : M M II I I I I 

Db 2 6 PAL — AVLGASRPGSAAQDDDFFHELPETFPSDPPEPLPHFLIEPEEAYIVKNKPVNLYC 83 

Qy 66 KAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLEEYWCQ 125

II II I I I : II I I III I MM: I MM I I I : II II I I : : M I : II II 

Db 84 KAS PATQI YFKCNSEWVHQKDHWDERVDETSGLI VCEVS I EI SRQQVEELFGPEDYWCQ 143 

Qy 126 CVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAEVEWLR 185 

I II I I I M I I M M M : II I Ml I II I II I I I II M : : I II M II I I II I I I I : 

Db 144 CVAWS SAGTTKS RKAYVRI AYLRKT FEQEPLGKEVS LEQEVLLQCRP PEGI PYAEVEWLK 203 

Qy 186 NEDLVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSASAAVIVYVNGGW 245

II : : : I I I I I I I : I M : : : M M : M M I I I I I M M I : I : I : I I M I M I II 

Db 204 N E E VI D P VE D RN F Y I T I DHN L 1 1 KQARL S DT AN YT C VAKN I VAKRK S T TAT VI VYVN G GW 263 

Qy 246 STWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGSWSPWS 305 

I I I I I I I I I I I I M I I : I : I I I I I I I I I I M I I I I II I I II II M II I I : II 

Db 264 STWTEWSACNSRCGRGFQKRTRTCTNPAPLNGGAFCEGQNVQKIACTTLCPVDGKWTSWS 323 



Qy 306 KWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHSASGPEDVALYVGL 365 

I I I I I : I II I I I I ! : I I I : I I I : : I : I j : : I | | | | : : | : | I | | I I I : 
Db 324 KWSTCGTECTHWRRRECTAPAPKNGGKDCEGLVLQSKNCTDGLCMQAAPDSDDVALYVGI 383 

Qy 366 -IAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKADNPHLLTIQP 424 

I [ I I I I : : : I : I I : : I I : I I I I I I I I I : I I : : I I I : I 
Db 384 VIAVIVCLAI SVWALFVYRKNHRDFESDI I DSSALNGGFQPVNI KAARQD LLAVPP 440 

Qy 425 DLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSS PTSEAEEFVS 480 

II:: I : I : I II : I I : I I I : : : : : I | | : | | 

Db 441 DLTSAAAMYRGPVYALHD-VSDKIPMTNSPILDPLPNLKIKVYNTSGAVTPQDELSDFSS 499 

Qy 4 81 RLS TQNYF RSLPRGT — SNMTYGTFNFLGGRLMI PNTGI SLLI PPDAI 526 

: I I M: : I I I I I : I I I I I I I I : I I I : I : I I I I I I : 

Db 500 KLSPQITQSLLENET LNVKNQS LARQTDP S CTAFGT FNS LGGHLVI PN S GVS LLI PAGAV 559 

Qy 527 PRGKIYEIYLTLHKPEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVI LAMDHCGEPSPDS 586 

I : I : : I I : I : I : I : I : I I : I I I I : I : I I I I I I I I I I I I I : I I II I I : I 
Db 560 PQGRVYEMYVTVHRKEGMRPPVEDSQTLLTPWSCGPPGALLTRPWLTMHHCAEPNMDD 619 

Qy 587 WSLRLKKQSCEGSWEDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAA 646 

I : : I I I : : I I I I I : : I I I : III: I I : : I I I : | | | | : : : : I I I 
Db 620 WQIQLKHQAGQGPWEDWWGEENFTTPCYIQLDPEACHILTETLSTYALVGQSITKAAA 679 

Qy 647 KRLKLLLFAPVACTSLEYNIRVYCLHDTHDALKEWQLEKQLGGQLIQEPRVLHFKDSYH 706 

I I I I I : I I :: I : I I I I : I I I I I I II I I I I I I : I I I : I : I I I I : : I I : I I I I I I 
Db 680 KRLKLAI FGPLSCSSLEYSIRVYCLDDTQDALKEVLQLERQMGGQLLEEPKTLHFKGSTH 739 

Qy 7 07 NLRLSIHDVPSSLWKSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWV 7 66 

I I I I I I I I : I I I I I I I I II I I I II I I : I II I I I I I I I I I I : I : I III I 
Db 740 NLRLSIHDIAHSLWKSKLPAKYQEIPFYHIWSGCQRNLHCTFTLERFSLNTLELVCKLCV 799 

Qy 767 WQVEGDGQSFSINFNITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISSLDPP 826 

I I I I : I I I : I : : : : : : : : I : : I I I : I I I I I I I I : I I I I I 
Db 800 RQVEGEGQIFQLNCSVSEEPTGIDYPIMDSAGSITTIVGPNAFSIPLPIRQKLCSSLDAP 859 

Qy 827 CRRGADWRTLAQKLHLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQP 886

II Ml II II II : I I I : I III : I I : I I I I : : I I : I I I I III : : I : 
Db 860 QTRGHDWRMLAHKLKLDRYLNYFATKSSPTGVILDLWEAQNFPDGNLSMLAAVLEEMGRH 919 

Qy 887 DAGLFTVSE 895
       :  :   :|
Db 920 ETVVSLAAE 928



RESULT 7
O95185
ID O95185 PRELIMINARY; PRT; 931 AA.
AC O95185;
DT 01-MAY-1999 (TrEMBLrel. 10, Created)
DT 01-MAY-1999 (TrEMBLrel. 10, Last sequence update)
DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update)
DE Transmembrane receptor UNC5C.
GN UNC5C.
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo.
OX NCBI_TaxID=9606;
RN [1]
RP SEQUENCE FROM N.A.
RC TISSUE=Brain;
RX MEDLINE=99000841; PubMed=9782087;
RA Ackerman S.L., Knowles B.B.;
RT "Cloning and mapping of the UNC5C gene to human chromosome 4q21-q23.";
RL Genomics 52:205-208(1998).
DR EMBL; AF055634; AAC67491.1; -.
DR Genew; HGNC:12569; UNC5C.
DR GO; GO:0005042; F:netrin receptor activity; TAS.
DR GO; GO:0007411; P:axon guidance; TAS.
DR GO; GO:0007420; P:brain development; TAS.
DR InterPro; IPR000488; Death.
DR InterPro; IPR007110; Ig-like.
DR InterPro; IPR003598; Ig_c2.
DR InterPro; IPR000884; TSP1.
DR InterPro; IPR008085; TSP_1.
DR InterPro; IPR000906; ZU5.
DR Pfam; PF00531; death; 1.
DR Pfam; PF00047; ig; 1.
DR Pfam; PF00090; tsp_1; 2.
DR Pfam; PF00791; ZU5; 1.
DR PRINTS; PR01705; TSP1REPEAT.
DR SMART; SM00005; DEATH; 1.
DR SMART; SM00408; IGc2; 1.
DR SMART; SM00209; TSP1; 2.
DR SMART; SM00218; ZU5; 1.
DR PROSITE; PS50835; IG_LIKE; 1.
DR PROSITE; PS50092; TSP1; 2.
KW Immunoglobulin domain; Receptor.
SQ SEQUENCE 931 AA; 103101 MW; EFD71122C98DABB8 CRC64;

Query Match 57.5%; Score 2755; DB 4; Length 931;
Best Local Similarity 56.4%; Pred. No. 8.7e-245;
Matches 514; Conservative 154; Mismatches 215; Indels 28; Gaps 9;

Qy 9 PALLGIVLAAWLRGSGAQQS AT VAN P VP GAN P D L L P H F L VE P E D VY I VKN K P VL L VC 65 

Ml : I : I I I I I : I I : I I I I I : I I I : I I I I I I I I I I 

Db 26 PAL — ALLSASGTGSAAQDDDFFHELPETFPSDPPEPLPHFLIEPEEAYIVKNKPVNLYC 83 

Qy 66 KAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLEEYWCQ 125 

II MMhllll III I II::: I :||| II I :||||||::|| |:|||| 
Db 84 KAS PATQI YFKCNSEWVHQKDHI VDERVDETSGLI VREVS I EI S RQQVEELFGPEDYWCQ 143 

Qy 126 CVAWSSSGTTKSQKAYIRIARLRKNFEQEPIAKEVSLEQGIVLPCRPPEGIPPAEVEWLR 185 

I I I I I I : I I I I I : I I I : I I I III I I I I II I I I I I I I : : I I I I I I I I I MINI: 
Db 144 CVAWSSAGTTKSRKAYVRIAYLRKTFEQEPLGKEVSLEQEVLLQCRPPEGIPVAEVEWLK 203 

Qy 186 NEDLVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSASAAVIVYVNGGW 245

I I I : : I I I I III : I : I : : : I I I I : I I I I I I I I I I I I I I : I : I : I I I I I I I I I I 
Db 204 NEDIIDPVEDRNFYITIDHNLIIKQARLSDTANYTCVAKNIVAKRKSTTATVIWVNGGW 263 



Qy 246 STWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGSWSPWS 305 

MINIMI:: I I I I : I | | : | : | | | I I I I I I I I I I I I I : I I I II lllllll Mill 
Db 264 STWTEWSVCNSRCGRGYQKRTRTCTNPAPLNGGAFCEGQSVQKIACTTLCPVDGRWTPWS 323 



Qy 306 KWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHSASGPEDVALYVGL 365
       I I I I I : I I I I I I I I : I I I : I II :: I I I : : I I I I I : : I : I I I II I I
Db 324

Qy 366
       I I I I I I : : : I : I I : : I I : I I I I I I I I I : I I : : I I I
Db 384

Qy 425
       II:: I : I : I II : I I : I I I
Db

Qy 481 RLS TQNYF RS LPRGT- - SNMT YGT FNFLGGRLMI PNTGI S LLI P P DAI 526
       : I I II: : I I I I I : I : I I I I I I : : I I : I : I I I I | II
Db 500

Qy 527 PRGKIYEIYLTLHKPEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDS 586
       I : I :: I I : I : I : I : I : I I : I I I I : I : I I I I I I I I I I I I I : I I I I : I
Db 560

Qy 587 WSLRLKKQSCEGSWEDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAA 646
       I : M I : : I I I I I : : I I I : I : I : I I I : : I I I : I I I I : : || I
Db 620 WKILLKNQAAQGQWEDWWGEENFTTPCYIKLDAEACHILTENLSTYALVGHSTTKAAA 679

Qy 647 KRLKLLLFAPVACTSLEYNIRVYCLHDTHDALKEVVQLEKQLGGQLIQEPRVLHFKDSYH 706
       ||||| :| |: |:||||:|||||| || |||||:: ||:| ||||::||: |||| | |
Db 680 KRLKLAIFGPLCCSSLEYSIRVYCLDDTQDALKEILHLERQTGGQLLEEPKALHFKGSTH 739

Qy 707 NLRLSIHDVPSSLWKSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWV 766
       ||||||||:  ||||||||  ||||||||:|:|:|| ||||||||| | :| :| ||| |
Db 740 NLRLSIHDIT^HSLWKSKLLAKYQEIPFYHVWSGSQRNLHCTFTLERFSLNTVELVCKLCV 799

Qy 767 WQVEGDGQSFSINFNITKDTRFAELLALESEAGVPALVGPSAFKIPFLIRQKIISSLDPP 826
        ||||:|| | :|  ::::    :|  |:    :  : ||||| ||  ||||: |||| |
Db 800 RQVEGEGQIFQLNCTVSEEPTGIDLPLLDPANTITTVTGPSAFSIPLPIRQKLCSSLDAP 859

Qy 827 CRRGADWRTLAQKLHLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQP 886
         || ||| || ||:|| :|::||:| ||| :||:||||::||:|||| ||| :  :|:
Db 860 QTRGHDWRMLAHKLNLDRYLNYFATKSSPTGVILDLWEAQNFPDGNLSMLAAVLEEMGRH 919

Qy 887
Db 920



RESULT 8
Q8JGT4
ID Q8JGT4 PRELIMINARY; PRT; 943 AA.
AC Q8JGT4;
DT 01-OCT-2002 (TrEMBLrel. 22, Created)
DT 01-OCT-2002 (TrEMBLrel. 22, Last sequence update)
DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update)
DE UNC-5 receptor.
OS Xenopus laevis (African clawed frog).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Amphibia; Batrachia; Anura; Mesobatrachia; Pipoidea; Pipidae;
OC Xenopodinae; Xenopus.
OX NCBI_TaxID=8355;
RN [1]
RP SEQUENCE FROM N.A.
RA Anderson R.B., Holt C.E.;
RT "Expression of UNC-5 in the developing Xenopus visual system.";
RL Submitted (APR-2002) to the EMBL/GenBank/DDBJ databases.
DR EMBL; AY099459; AAM34486.1; -.
DR GO; GO:0004872; F:receptor activity; IEA.
DR GO; GO:0007165; P:signal transduction; IEA.
DR InterPro; IPR000488; Death.
DR InterPro; IPR003599; Ig.
DR InterPro; IPR007110; Ig-like.
DR InterPro; IPR003598; Ig_c2.
DR InterPro; IPR000884; TSP1.
DR InterPro; IPR008085; TSP_1.
DR InterPro; IPR000906; ZU5.
DR Pfam; PF00531; death; 1.
DR Pfam; PF00047; ig; 1.
DR Pfam; PF00090; tsp_1; 2.
DR Pfam; PF00791; ZU5; 1.
DR PRINTS; PR01705; TSP1REPEAT.
DR SMART; SM00005; DEATH; 1.
DR SMART; SM00409; IG; 2.
DR SMART; SM00408; IGc2; 1.
DR SMART; SM00209; TSP1; 2.
DR SMART; SM00218; ZU5; 1.
DR PROSITE; PS50835; IG_LIKE; 1.
DR PROSITE; PS50092; TSP1; 2.
KW Immunoglobulin domain; Receptor.
SQ SEQUENCE 943 AA; 105083 MW; A024E24A7EDB6175 CRC64;

Query Match 55.2%; Score 2646.5; DB 13; Length 943;
Best Local Similarity 53.0%; Pred. No. 9.3e-235;
Matches 496; Conservative 163; Mismatches 229; Indels 47; Gaps 8;

Qy 10 ALLGIVLAAWLRG SGAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKPVL 62 

II : I I : : I = I : : II : I I I I I : I I I I I I I I I I I I 

Db 10 AALAAI LVALILSCNFPSSTAGIEYSDVLPDSFPSAPAESLPHFLLEPEDAYIVKNKPVE 69 

Qy 63 LVCKAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLEEY 122 

I I I I I I I I I I : I I I I I I I I I I I : : I : I I I I : I I I I I I I I :: I I I I : I 
Db 7 0 LVCKANPATQIYFKCNGEmmQNDHITKERVDDVTGLWREVQIEVSRQQVEELFGLEDY 129 

Qy 123 WCQCVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAEVE 182 

I I I I I I I I I : I I I I I ::: I : I I I I I I I I : I I I I I I I : I I I : I I I I I I I : I I I I I I 
Db 130 WCQCVAWSSAGTTKSKRSYVRIAYLRKNFDQEPLGKEVALEQEALLQCRPPEGVPPAEVE 18 9 

Qy 183 WLRNEDLVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSASAAVIVYVN 242

I I : I I : : : I I : I I II : I : I : : : II I I : I I I I I I I I : I I I I I : I I I : I I I I : I I 
Db 190 WLKNEEIIDPTKDTNFLITIDHNLIIKQARLSDTANYTCVSKNIVAKRRSTTATVIVFVN 249 

Qy 243 GGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGSWS 302 

1111:11111 I : II I I I I I : I : I I I I I I I I I I MM II I I M II I I I I : 
Db 250 GGWS SWTEWS PCNNRCGHGWQKRTRTCTNPAPLNGGTMCEGQQYQKFACNTMCPVDGGWT 309 

Qy 303 PWSKWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHS 352 

I I I I I I I MINIUM: I I : II I : M I II : M I I I I : : 



Db 



310 EWSKWSACSTECTHWRSRECNAPTPKNGGKDCSGMLLDSKNCTDGLCMQNKRVLGETKSR 369 



Qy 353 -ASGPEDVALYVGL-IAVAVCLVLLLLVLILVYCRKKEGLDSDVADSS-ILTSGFQPVSI 409 

Mill II :|: : ::||: I |:|| I |:|: III II II II: 

Db 37 0 LLESTGDVALYAGLWAIFIVIILLMAVGIWYRRNCREFDTDITDSSAALTGGFHPVNF 429 

Qy 410 KPSKADNPHLL— TIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLH 467 

11:11 I : : I II I : I : I : : I I : I : I I II M : : : 

Db 430 KTSRHDNSQLIHPAMQPDLTANAGIYRGNMYALQDS-ADKIPMTNSPLLDPLPNLKIKVY 488 

Qy 468 HSS PTSEAEEFVSRLSTQN YFRSLPRGTSNMTYGTF 503 

: I I II : : : : | : : I I I : I I III 

Db 489 NSSTVGSSPGIHDGNNLLGTKPTGTYPSDNNIMNARNKNMSMQHLLTLPRDSSNSVTGTF 54 8 

Qy 504 NFLGGRLMI PNTGI SLLI PPDAI PRGKI YEI YLTLHKPEDVRLPLAGCQTLLS PI VSCGP 563 

I I I I I I I I I : I I I I I I II : II I I : I I : : I I : II I I I : I I I I :: I I I 
Db 549 GSLGGRLTFPNTGVSLLIPQGAIPQGKYYEMYLMINKRENTVLPSEGTQTILSPIITCGP 608 

Qy 564 PGVLLTRPVI LAMDHCGEPSPDSWSLRLKKQSCEGSWEDVLHLGEEAPSHLYYCQLEASA 623 

I : I I : I I I I : I I : : I I : I I I I : I : I I : I : I | | : Mill:: 
Db 609 TGLLLCKPVILTVPHCADINTSDWILQLKTQSHQGNWEEWTLNEETLNTPCYCQLESHS 668 

Qy 624 CYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVACTSLEYNIRVYCLHDTHDALKEVVQ 683

I : : I I I : I IN: I : I I I I : I : I I I : I I I I I I I : : I I I : II I I I I I I : : 
Db 669 CHTLLDQLGTYAFVGESYSRSAIKRLQLAIFAPMLCTSLEYNLKVYCVEDTPDALKEVLE 728 

Qy 684 LEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLWKSKLLVSYQEIPFYHIWNGTQRY 743 

I I I III I : : I I : : I I I I I I I I I I I I I I I : I I I I : I I I : I I I I I I I I I I : I : I I 
Db 729 LEKTLGGYLVEEPKLLMFKDSYHNLRLSIHDIPHSLWRSKLMAKYQEIPFYHIWSGSQRT 788

Qy 744 LHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFSINFNITKDTRFAELLALESEAGVPAL 803 

I I I I I I I I I I ::: I I I : I I I I I : I I I : : : : : : : : : I I 
Db 789 LHCTFTLERYSLAATELTCKICVRQVEGEGQIFQLHTLLEENVKSFDPFCSQAENSVTTH 848 

Qy 804 VGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLAQKLHLDSHLSFFASKPSPTAMILNLW 863 

: I I I I I I I I I I I I I : I I I I II III I I I I I : I : I : : I I : I III : I I : I I 
Db 849 LGPYAFKIPFSIRQKICNSLDAPNSRGNDWRLLAQKLCMDRYLNYFATKASPTGVILDLW 908

Qy 864 EARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAEC 898

III : I : | : | | : | : : | : : | : : : | 
Db 909 EALHQDDGDLNTLASALEEMGKSEMMLVMATDGDC 943 



RESULT 9 
Q80Y85 

ID Q80Y85 PRELIMINARY; PRT; 1008 AA. 

AC Q80Y85; 

DT 01-JUN-2003 (TrEMBLrel. 24, Created) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE Unc5h2 protein (Fragment).

GN UNC5H2.

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090;

RN [1] 



RP SEQUENCE FROM N.A.

RC STRAIN=C57BL/6; TISSUE=Brain;

RX MEDLINE=22388257; PubMed=12477932;

RA Strausberg R.L., Feingold E.A., Grouse L.H., Derge J.G.,

RA Klausner R.D., Collins F.S., Wagner L., Shenmen C.M., Schuler G.D.,

RA Altschul S.F., Zeeberg B., Buetow K.H., Schaefer C.F., Bhat N.K.,

RA Hopkins R.F., Jordan H., Moore T., Max S.I., Wang J., Hsieh F.,

RA Diatchenko L., Marusina K., Farmer A.A., Rubin G.M., Hong L.,

RA Stapleton M., Soares M.B., Bonaldo M.F., Casavant T.L., Scheetz T.E.,

RA Brownstein M.J., Usdin T.B., Toshiyuki S., Carninci P., Prange C.,

RA Raha S.S., Loquellano N.A., Peters G.J., Abramson R.D., Mullahy S.J.,

RA Bosak S.A., McEwan P.J., McKernan K.J., Malek J.A., Gunaratne P.H.,

RA Richards S., Worley K.C., Hale S., Garcia A.M., Gay L.J., Hulyk S.W.,

RA Villalon D.K., Muzny D.M., Sodergren E.J., Lu X., Gibbs R.A.,

RA Fahey J., Helton E., Ketteman M., Madan A., Rodrigues S., Sanchez A.,

RA Whiting M., Madan A., Young A.C., Shevchenko Y., Bouffard G.G.,

RA Blakesley R.W., Touchman J.W., Green E.D., Dickson M.C.,

RA Rodriguez A.C., Grimwood J., Schmutz J., Myers R.M., Butterfield Y.S.,

RA Krzywinski M.I., Skalska U., Smailus D.E., Schnerch A., Schein J.E.,

RA Jones S.J., Marra M.A.;

RT "Generation and initial analysis of more than 15,000 full-length human 

RT and mouse cDNA sequences."; 

RL Proc. Natl. Acad. Sci. U.S.A. 99:16899-16903(2002). 

RN [2] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6; TISSUE=Brain;

RA Strausberg R.;

RL Submitted (MAR-2003) to the EMBL/GenBank/DDBJ databases.

DR EMBL; BC048162; AAH48162.1;

DR GO; GO:0007165; P:signal transduction; IEA.

DR InterPro; IPR000488; Death. 

DR InterPro; IPR003599; Ig. 

DR InterPro; IPR007110; Ig-like. 

DR InterPro; IPR003598; Ig_c2.

DR InterPro; IPR000884; TSP1. 

DR InterPro; IPR008085; TSP_1. 

DR InterPro; IPR000906; ZU5. 

DR Pfam; PF00531; death; 1. 

DR Pfam; PF00047; ig; 1. 

DR Pfam; PF00090; tsp_1; 2.

DR Pfam; PF00791; ZU5; 1. 

DR PRINTS; PR01705; TSP1REPEAT. 

DR SMART; SM00005; DEATH; 1. 

DR SMART; SM00409; IG; 2. 

DR SMART; SM00408; IGc2; 1.

DR SMART; SM00209; TSP1; 2. 

DR SMART; SM00218; ZU5; 1. 

DR PROSITE; PS50017; DEATH_DOMAIN; 1. 

DR PROSITE; PS50835; IG_LIKE; 1. 

DR PROSITE; PS50092; TSP1; 2. 

FT NON_TER 1 1 

SQ SEQUENCE 1008 AA; 110438 MW; BCE5CA0EC537C130 CRC64;



Query Match 54.0%; Score 2585; DB 11; Length 1008;

Best Local Similarity 53.7%; Pred. No. 4.9e-229;

Matches 505; Conservative 151; Mismatches 235; Indels 50; Gaps 14;



Qy 1 MAVRPGLWPALLGIVLAAW LRG--SGAQQSATVANPVPGANPDLLPHFLVEPEDV 53 

111:111:11 I I I I :: I I : 11:11:11:1 
Db 75 MRARSGVRSALLLALLLCWDPTPSLAGVDSAGQ VLPDSYPSAPAEQLPYFLLEPQDA 131 

Qy 54 YIVKNKPVLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQV 113 

I I I I II I I I I : I I I I I I : I I I I I I I I I IN : I I : : I I 1 : MINI 

Db 132 YIVKNKPVELHCRAFPATQIYFKCNGEWVSQNDHVTQESLDEATGLRVREVQIEVSRQQV 191 

Qy 114 EKVFGLEEYWCQCVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPP 173 

I :: I I I I : I I I I I I I I I I I I I I I I :: I I I I I I I I I I I : I I I I I I I I I : : : i MM 
Db 192 EELFGLEDYWCQCVAWSSSGTTKSRRAYIRIAYLRKNFDQEPLAKEVPLDHEVLLQCRPP 251 

Qy 174 EGIPPAEVEWLRNEDLVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSA 233

11:1 I I I I II : I I I : : II : I I : I : I : I : : I I I I I : I I I I I I I II I I I I | : | | | 
Db 252 EGVPVAEVEWLKNEDVIDPAQDTNFLLTIDHNLIIRQARLSDTANYTCVAKNIVAKRRST 311 

Qy 234 SAAVIVYVNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACAT 293 

: I I I I I I I I I I I : I III II I I I I I I I I : I : I I I II I I I I I I I I I II I I I I I I 
Db 312 TATVIVYVNGGWSSWAEWSPCSNRCGRGWQKRTRTCTNPAPLNGGAFCEGQAFQKTACTT 371 

Qy 294 LCPVDGSWSPWSKWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHSA 353 

: I I I I I : I : I I I I I I I : I I I I I I I I I I : I I I : I I I I I : : I I I Ml: 
Db 372 VCPVDGAWTEWSKWSACSTECAHWRSRECMAPPPQNGGRDCSGTLLDSKNCTDGLCVLTL 431 

Qy 354 SGPEDVALYVGL-IAVAVCLVLLLLVLILVYCRKKEGLDSDVADSS-ILTSGFQPVSIKP 411

I I I I I I I : I I I : : I : I : : I I I I : I : I I I I I I I I I : I 

Db 432 ETSGDVALYAGLVVAVFVVVAVLMAVGVIVYRRNCRDFDTDITDSSAALTGGFHPVNFKT 491

Qy 412 SKADNPHLL--TIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHS 469

: : : I I I I : III:: I : I : I I : I : I I | | | | : : : : | 
Db 492 ARPNNPQLLHPSAPPDLTASAGIYRGPVYALQDS-ADKIPMTNSPLLDPLPSLKIKVYNS 550 

Qy 470 S PTSEAEEFVS RLSTQNYFRS LPRGTSNMTY 500 

I I : : I I I : : | I INI: 

Db 551 STIGSGSGLADGADLLGVLPPGTYPGDF-SRDTHFLHLRSASLGSQHLLGLPRDPSSSVS 609 

Qy 501 GTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHKPEDVRLPLA-GCQTLLSPIV 559

Ml I I I II : I I I : II I : I I I I : I I I : : M : : I I I II : I II : I II I 
Db 610 GTFGCLGGRLSLPGTGVSLLVPNGAIPQGKFYDLYLHINKAEST-LPLSEGSQTVLSPSV 668 

Qy 560 SCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSWEDVLHLGEEAPSHLYYCQL 619 

: I II I : II II I : I : M I I : I I I : : I I I : I : I I | : MM 

Db 669 TCGPTGLLLCRPVVLTVPHCAEVIAGDWIFQLKTQAHQGHWEEVVTLDEETLNTPCYCQL 728

Qy 620 EASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVACTSLEYNIRVYCLHDTHDALK 679

MM:: : M I : : I I : I : I I II : I : I I I II I I II :: I I I I I II Ml 
Db 729 EAKSCHILLDQLGTYVFMGESYSRSAVKRLQLAIFAPALCTSLEYSLRVYCLEDTPVALK 788 

Qy 680 EVVQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLWKSKLLVSYQEIPFYHIWNG 739

11-11: III I : : I I : I II I I I I I I I I I : I I : I : I : II II I I II I II I : II I 
Db 789 EVLELERTLGGYLVEEPKPLLFKDSYHNLRLSLHDIPHAHWRSKLLAKYQEIPFYHVWNG 848 

Qy 740 TQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFSINFNITKDTRFAELLALESEAG 799 

: II I I I I II I I I I : : :: II : I I I I I : I I I : : : : | Mil I 

Db 849 SQRALHCTFTLERHSLASTEFTCKVCVRQVEGEGQIFQLHTTLA-ETPAGSLDALCSAPG 907 



Qy 800 --VPALVGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLAQKLHLDSHLSFFASKPSPTA 857

: : I I I I I I I I I I I I I I I I I I I I I I I I I I I : I : I :: I I : I I I I

Db 908 NAITTQLGPYAFKIPLSIRQKICSSLDAPNSRGNDWRLLAQKLSMDRYLNYFATKASPTG 967



Qy 858 MILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAEC 898 

: I I : I I I I I : | : | : | | : | : : | : : : : : : | 
Db 968 VILDLWEARQQDDGDLNSLASALEEMGKSEMLVAMATDGDC 1008 



RESULT 10 
Q8K1S3 

ID Q8K1S3 PRELIMINARY; PRT; 945 AA. 

AC Q8K1S3; 

DT 01-OCT-2002 (TrEMBLrel. 22, Created) 

DT 01-OCT-2002 (TrEMBLrel. 22, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE Netrin receptor Unc5h2.

GN UNC5H2.

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090;

RN [1] 

RP SEQUENCE FROM N.A. 

RA Engelkamp D.; 

RT "Cloning of three mouse unc-5 genes and their expression patterns at 

RT mid-gestation."; 

RL Submitted (MAY-2002) to the EMBL/GenBank/DDBJ databases.

DR EMBL; AJ487853; CAD32251.1;

DR MGD; MGI:894703; Unc5h2.

DR GO; GO:0004872; F:receptor activity; IEA.

DR GO; GO:0007165; P:signal transduction; IEA.

DR InterPro; IPR000488; Death. 

DR InterPro; IPR003599; Ig. 

DR InterPro; IPR007110; Ig-like. 

DR InterPro; IPR003598; Ig_c2. 

DR InterPro; IPR000884; TSP1. 

DR InterPro; IPR008085; TSP_1. 

DR InterPro; IPR000906; ZU5. 

DR Pfam; PF00531; death; 1. 

DR Pfam; PF00047; ig; 1. 

DR Pfam; PF00090; tsp_1; 2.

DR Pfam; PF00791; ZU5; 1. 

DR PRINTS; PR01705; TSP1REPEAT. 

DR SMART; SM00005; DEATH; 1. 

DR SMART; SM00409; IG; 2. 

DR SMART; SM00408; IGc2; 1. 

DR SMART; SM00209; TSP1; 2. 

DR SMART; SM00218; ZU5; 1. 

DR PROSITE; PS50017; DEATH_DOMAIN; 1. 

DR PROSITE; PS50835; IG_LIKE; 1. 

DR PROSITE; PS50092; TSP1; 2. 

KW Immunoglobulin domain; Receptor. 

SQ SEQUENCE 945 AA; 103738 MW; 80E896F0F0E06012 CRC64;

Query Match 53.8%; Score 2578.5; DB 11; Length 945; 

Best Local Similarity 53.2%; Pred. No. 1.8e-228; 

Matches 506; Conservative 150; Mismatches 235; Indels 61; Gaps 15; 



Qy 1 MAVRPGLWPALLGIVLAAW LRG--SGAQQSATVANPVPGANPDLLPHFLVEPEDV 53

I l>: III : I I II I I : : II : I I : I I : I I : I
Db 1 MRARSGVRSALLLALLLCWDPTPSLAGVDSAGQ VLPDSYPSAPAEQLPYFLLEPQDA 57

Qy 54 YIVKNKPVLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQV 113

M I I I II I I I : I I II II : I I I II I I I I III : I I : : I I ! I : I MINI
Db 58 YIVKNKPVELHCRAFPATQIYFKCNGEWVSQNDHVTQESLDEATGLRVREVQIEVSRQQV 117

Qy 114 EKVFGLEEYWCQCVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPP 173

I I I I I : I I I I I I I I I I I I I I I I :: I I I I I I I I I I I : I I I I I I I I I : : : I I I I I
Db 118 EELFGLEDYWCQCVAWSSSGTTKSRRAYIRIAYLRKNFDQEPLAKEVPLDHEVLLQCRPP 177

Qy 174 EGIPPAEVEWLRNEDLVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSA 233

I I : I I I I I I I : I I I : : I I : I I : I : I : I :: I I I I I : I I I I I I I I I I I I I I : I I 1
Db 178 EGVPVAEVEWLKNEDVIDPAQDTNFLLTIDHNLIIRQARLSDTANYTCVAKNIVAKRRST 237

Qy 234 SAAVIVYVNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACAT 293

: I I I I I I I I I I I : I III II I I I I I I I I : I : I I I I I I I I I I I I I I I I I I I I I I
Db 238 TATVIVYVNGGWSSWAEWSPCSNRCGRGWQKRTRTCTNPAPLNGGAFCEGQAFQKTACTT 297

Qy 294 LCPVDGSWSPWSKWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCV 350

: I I I I I : I : I I I I I I I : I I I I I I I I I hill : I I I I I : : I I I | | |
Db 298 VCPVDGAWTEWSKWSACSTECAHWRSRECMAPPPQNGGRDCSGTLLDSKNCTDGLCVLNQ 357

Qy 351 HSASGPEDVALYVGL-IAVAVCLVLLLLVLILVYCRKKEGLDSDVADSS-IL 400

I I I I I I I I : I I I : : I : I : : I I I I : I : I I I I
Db 358 RTLNDPKSHPLETSGDVALYAGLVVAVFVVVAVLMAVGVIVYRRNCRDFDTDITDSSAAL 417

Qy 401 TSGFQPVSIKPSKADNPHLL--TIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSP 458

I I I I I : I : : : I I I I : III:: I : I : | | : | : I I | | |
Db 418 TGGFHPVNFKTARPNNPQLLHPSAPPDLTASAGIYRGPVYALQDS-ADKIPMTNSPLLDP 476

Qy 459 LGGGRHTLHHSS PTSEAEEFVSRLSTQNYFRS 490

I : : : : I I I : : I I I : : I I
Db 477 LPSLKIKVYNSSTIGSGSGLADGADLLGVLPPGTYPGDF-SRDTHFLHLRSASLGSQHLL 535

Qy 491 -LPRGTSNMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHKPEDVRLPLA 549

III I : III I I I I I : I I I : I I I : I I I I : I I I : : I I : : I I I I I :
Db 536 GLPRDPSSSVSGTFGCLGGRLSLPGTGVSLLVPNGAIPQGKFYDLYLHINKAEST-LPLS 594

Qy 550 -GCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSWEDVLHLGE 608

I I I : I I I I : I I I I : I I I I I : I : I I I I : I I I : : I I I : I : I I
Db 595 EGSQTVLSPSVTCGPTGLLLCRPVVLTVPHCAEVIAGDWIFQLKTQAHQGHWEEVVTLDE 654

Qy 609 EAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVACTSLEYNIRV 668

I : I I I I I I : I : : : I I I : : I I : I : I I I I : I : I | | I I I I I I :: I I
Db 655 ETLNTPCYCQLEAKSCHILLDQLGTYVFMGESYSRSAVKRLQLAIFAPALCTSLEYSLRV 714

Qy 669 YCLHDTHDALKEVVQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLWKSKLLVSY 728

I I I M I I I I I : : I I : I I I I : : I I : | I I I I I I I I I I I : I I : I : I : I I I I I
Db 715 YCLEDTPVALKEVLELERTLGGYLVEEPKPLLFKDSYHNLRLSLHDIPHAHWRSKLLAKY 774

Qy 729 QEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFSINFNITKDTRF 788

I I I I I I I : M I : I I I I I I I I I I I I : : : : I I : I I I I I : I I I : : : : |
Db 775 QEIPFYHVWNGSQRALHCTFTLERHSLASTEFTCKVCVRQVEGEGQIFQLHTTLA-ETPA 833



Qy 789 AELLALESEAG--VPALVGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLAQKLHLDSHL 846

I I I I I : : I I I I I I I I I I I I I I I I I I I I I I I I I I I : I : I 
Db 834 GSLDALCSAPGNAITTQLGPYAFKIPLSIRQKICSSLDAPNSRGNDWRLLAQKLSMDRYL 893 

Qy 847 SFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAEC 898

: : I I : I I I I : I I : I I I I I : I : I : I I : I : : I : : : : : : I 
Db 894 NYFATKASPTGVILDLWEARQQDDGDLNSLASALEEMGKSEMLVAMATDGDC 945 



RESULT 11 
O08722

ID O08722 PRELIMINARY; PRT; 945 AA.

AC O08722;

DT 01-JUL-1997 (TrEMBLrel. 04, Created) 

DT 01-JUL-1997 (TrEMBLrel. 04, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE Transmembrane receptor UNC5H2.

OS Rattus norvegicus (Rat).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus. 

OX NCBI_TaxID=10116; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=97271897; PubMed=9126742;

RA Leonardo E.D., Hinck L., Masu M., Keino-Masu K., Ackerman S.L.,

RA Tessier-Lavigne M.; 

RT "Vertebrate homologues of C. elegans UNC-5 are candidate netrin 

RT receptors."; 

RL Nature 386:833-838(1997). 

DR EMBL; U87306; AAB57679.1; 

DR GO; GO:0004872; F:receptor activity; IEA.

DR GO; GO:0007165; P:signal transduction; IEA.

DR InterPro; IPR000488; Death. 

DR InterPro; IPR007110; Ig-like. 

DR InterPro; IPR003598; Ig_c2. 

DR InterPro; IPR000884; TSP1. 

DR InterPro; IPR008085; TSP_1. 

DR InterPro; IPR000906; ZU5. 

DR Pfam; PF00531; death; 1. 

DR Pfam; PF00047; ig; 1. 

DR Pfam; PF00090; tsp_1; 2.

DR Pfam; PF00791; ZU5; 1. 

DR PRINTS; PR01705; TSP1REPEAT. 

DR SMART; SM00005; DEATH; 1. 

DR SMART; SM00408; IGc2; 1. 

DR SMART; SM00209; TSP1; 2.

DR SMART; SM00218; ZU5; 1. 

DR PROSITE; PS50017; DEATH_DOMAIN; 1. 

DR PROSITE; PS50835; IG_LIKE; 1.

DR PROSITE; PS50092; TSP1; 2.

KW Immunoglobulin domain; Receptor. 

SQ SEQUENCE 945 AA; 103520 MW; 6E9C2A262E560B9B CRC64; 

Query Match 53.8%; Score 2578.5; DB 11; Length 945; 

Best Local Similarity 53.0%; Pred. No. 1.8e-228; 

Matches 509; Conservative 142; Mismatches 231; Indels 79; Gaps 17; 



Qy 1 MAVRPGLWPALLGIVLAAW LRG--SGAQQSATVANPVPGANPDLLPHFLVEPEDV 53

III III : I I II Ml : : II : I I I I I : I I I I 

Db 1 MRARSGARGALLLALLLCWDPTPSLAGIDSGGQ ALPDSFPSAPAEQLPHFLLEPEDA 57 

Qy 54 YIVKNKPVLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQV 113

I I I I I I I I I I : I II I I I : I I I I I II I I I I : I I : : I I I I : I I I I I I I 
Db 58 YIVKNKPVELHCRAFPATQIYFKCNGEWVSQKGHVTQESLDEATGLRIREVQIEVSRQQV 117 

Qy 114 EKVFGLEEYWCQCVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPP 173 

I :: I I I I : J I I I I I I I I I I I I I I I :: I I I I I I I I I I I : I I I I I I I I I : : : I I I I I 
Db 118 EELFGLEDYWCQCVAWSSSGTTKSRRAYIRIAYLRKNFDQEPLAKEVPLDHEVLLQCRPP 177

Qy 174 EGIPPAEVEWLRNEDLVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSA 233

I I : I I I I I I I : I I I : : I I : I I : I : I : I : : I I I I I : I I I I I I I I I I I I I I : I I I 
Db 178 EGVPVAEVEWLKNEDVIDPAQDTNFLLTIDHNLIIRQARLSDTANYTCVAKNIVAKRRST 237 

Qy 234 SAAVIVYVNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACAT 293

: I I I I I I I II I I : I III II I I I I I I I I : I : I I I I I I I I I I I I I I I I Mill I 
Db 238 TATVIVYVNGGWSSWAEWSPCSNRCGRGWQKRTRTCTNPAPLNGGAFCEGQACQKTACTT 297

Qy 294 LCPVDGSWSPWSKWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCV 350 

: I I I I I : I : I I I I I I I : I I II I I I I I hill : I II I I : : I I I III 
Db 298 VCPVDGAWTEWSKWSACSTECAHWRSRECMAPPPQNGGRDCSGTLLDSKNCTDGLCVLNQ 357

Qy 351 HSASGPE DVALYVGL-IAVAVCLVLLLLVLILVYCRKKEGLDSDVADSS-IL 400 

: : I : I I I I I I I : I I I I : I : I : : I I I I : I : I I I I 

Db 358 RTLNDPKSRPLEPSGDVALYAGLVVAVFVVLAVLMAVGVIVYRRNCRDFDTDITDSSAAL 417

Qy 401 TSGFQPVSIKPSKADNPHLL--TIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSP 458

I I I I I : I : : I I I I : III:: I : I : II : I : I I I I I 
Db 418 TGGFHPVNFKTARPSNPQLLHPSAPPDLTASAGIYRGPVYALQDS-ADKIPMTNSPLLDP 476

Qy 459 L GGG - RHTLHHSSPTSEAEEFVS 480 

I II I I I I : 
Db 477 LPSLKIKVYDSSTIGSGAGLADGADLLGVLPPGTYPGDFSRDTHFLHLRS --A 527 

Qy 481 RLSTQNYFRSLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHK 540 

I : I : III I : II I I I I I I I I I I : I I I : I I I I : | I I : : I I : : I 
Db 528 SLGSQ-HLLGLPRDPSSSVSGTFGCLGGRLTIPGTGVSLLVPNGAIPQGKFYDLYLRINK 586 

Qy 541 PEDVRLPLA-GCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGS 599

I I I I : I I I : I I I I : I I I I : I I I I I : I : I I I I : I I I : : I 

Db 587 TEST-LPLSEGSQTVLSPSVTCGPTGLLLCRPVVLTVPHCAEVIAGDWIFQLKTQAHQGH 645

Qy 600 WEDVLHLGEEAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVAC 659 

I I : I : I I I : I I I I I I : I : : : I I I : I I : I : I I I I : I : I I I I 
Db 646 WEEVVTLDEETLNTPCYCQLEAKSCHILLDQLGTYVFTGESYSRSAVKRLQLAIFAPALC 705

Qy 660 TSLEYNIRVYCLHDTHDALKEVVQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSL 719

llll|::IMII II lll||::||: III |::||: I I I I I I I I I I I I : I I : I : 
Db 706 TSLEYSLRVYCLEDTPAALKEVLELERTLGGYLVEEPKTLLFKDSYHNLRLSLHDIPHAH 765 

Qy 720 WKSKLLVSYQEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFSIN 779

I : I I I I I I I I I I I I : I I I : I : I I I I I I I I I I : : : : I I : I I I I I : I I I :: 
Db 766 WRSKLLAKYQEIPFYHVWNGSQKALHCTFTLERHSLASTEFTCKVCVRQVEGEGQIFQLH 825 



Qy 780 FNITKDTRFAELLALESEAGVPAL--VGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLA 837

: ' I I M I I I MINIM I I II I : I I I I I I I I I I I

Db 826 TTLA-ETPAGSLDALCSAPGNAATTQLGPYAFKIPLSIRQKICNSLDAPNSRGNDWRLLA 884

Qy 838 QKLHLDSHLSFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAE 897 

I I I : I : I :: I I : I I I I : I I : I I I I I : | : | : | | : | : : | : : : : : : 
Db 885 QKLSMDRYLNYFATKASPTGVILDLWEARQQDDGDLNSLASALEEMGKSEMLVAMTTDGD 944 

Qy 898 C 898 

I 

Db 945 C 945 



RESULT 12 
Q9D398 

ID Q9D398 PRELIMINARY; PRT; 945 AA. 

AC Q9D398; 

DT 01-JUN-2001 (TrEMBLrel. 17, Created)

DT 01-JUN-2001 (TrEMBLrel. 17, Last sequence update)

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE 6330415E02Rik protein. 

GN UNC5H2 OR 6330415E02RIK.

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=C57BL/6J; TISSUE=Medulla oblongata;

RX MEDLINE=21085660; PubMed=11217851;

RA Kawai J., Shinagawa A., Shibata K., Yoshino M., Itoh M., Ishii Y.,

RA Arakawa T., Hara A., Fukunishi Y., Konno H., Adachi J., Fukuda S.,

RA Aizawa K., Izawa M., Nishi K., Kiyosawa H., Kondo S., Yamanaka I.,

RA Saito T., Okazaki Y., Gojobori T., Bono H., Kasukawa T., Saito R.,

RA Kadota K., Matsuda H.A., Ashburner M., Batalov S., Casavant T.,

RA Fleischmann W., Gaasterland T., Gissi C., King B., Kochiwa H.,

RA Kuehl P., Lewis S., Matsuo Y., Nikaido I., Pesole G., Quackenbush J.,

RA Schriml L.M., Staubli F., Suzuki R., Tomita M., Wagner L., Washio T.,

RA Sakai K., Okido T., Furuno M., Aono H., Baldarelli R., Barsh G.,

RA Blake J., Boffelli D., Bojunga N., Carninci P., de Bonaldo M.F.,

RA Brownstein M.J., Bult C., Fletcher C., Fujita M., Gariboldi M.,

RA Gustincich S., Hill D., Hofmann M., Hume D.A., Kamiya M., Lee N.H.,

RA Lyons P., Marchionni L., Mashima J., Mazzarelli J., Mombaerts P.,

RA Nordone P., Ring B., Ringwald M., Rodriguez I., Sakamoto N.,

RA Sasaki H., Sato K., Schoenbach C., Seya T., Shibata Y., Storch K.-F.,

RA Suzuki H., Toyo-oka K., Wang K.H., Weitz C., Whittaker C., Wilming L.,

RA Wynshaw-Boris A., Yoshida K., Hasegawa Y., Kawaji H., Kohtsuki S.,

RA Hayashizaki Y.;

RT "Functional annotation of a full-length mouse cDNA collection."; 

RL Nature 409:685-690(2001). 

DR EMBL; AK018177; BAB31108.1; -. 

DR MGD; MGI:894703; Unc5h2.

DR GO; GO:0007165; P:signal transduction; IEA.

DR InterPro; IPR000488; Death. 

DR InterPro; IPR007110; Ig-like. 

DR InterPro; IPR003598; Ig_c2.

DR InterPro; IPR000884; TSP1.

DR InterPro; IPR008085; TSP_1.

DR InterPro; IPR000906; ZU5.

DR Pfam; PF00531; death; 1.

DR Pfam; PF00047; ig; 1.

DR Pfam; PF00090; tsp_1; 2.

DR Pfam; PF00791; ZU5; 1. 

DR PRINTS; PR01705; TSP1REPEAT. 

DR SMART; SM00005; DEATH; 1. 

DR SMART; SM00408; IGc2; 1. 

DR SMART; SM00209; TSP1; 2. 

DR SMART; SM00218; ZU5; 1. 

DR PROSITE; PS50017; DEATH_DOMAIN; 1. 

DR PROSITE; PS50835; IG_LIKE; 1. 

DR PROSITE; PS50092; TSP1; 2. 

KW Immunoglobulin domain. 

SQ SEQUENCE 945 AA; 103725 MW; 43D33B4524E0CBF2 CRC64;



Query Match 53.7%; Score 2572.5; DB 11; Length 945; 

Best Local Similarity 53.0%; Pred. No. 6.4e-228; 

Matches 505; Conservative 150; Mismatches 236; Indels 61; Gaps 15; 

Qy 1 MAVRPGLWPALLGIVLAAW LRG--SGAQQSATVANPVPGANPDLLPHFLVEPEDV 53

I Ih III : I I II I I : : II : I I : I I : | | : | 

Db 1 MRARSGVRSALLLALLLCWDPTPSLAGVDSAGQ VLPDSYPSAPAEQLPYFLLEPQDA 57

Qy 54 YIVKNKPVLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQV 113 

I I I I I I I I I I : I 11111:11111111 I III : I I :: I I I I : I I I I I I I 
Db 58 YIVKNKPVELHCRAFPATQIYFKCNGEWVSQNDHVTQESLDEATGLRVREVQIEVSRQQV 117 



Qy 114 EKVFGLEEYWCQCVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPP 173

I :: I I I I : I I I I I I I I I I I I I I I I :: I I I I I I I I I I I : I I I I I I I I | : : : | | | | | 

Db 118 EELFGLEDYWCQCVAWSSSGTTKSRRAYIRIAYLRKNFDQEPLAKEVPLDHEVLLQCRPP 177

Qy 174 EGIPPAEVEWLRNEDLVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSA 233

I I : I I I I I I I : I I I : : I I : I I : I : I : I : : I I I II : I I I I I I I II II I II : I I I 

Db 178 EGVPVAEVEWLKNEDVIDPAQDTNFLLTIDHNLIIRQARLSDTANYTCVAKNIVAKRRST 237 

Qy 234 SAAVIVYVNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACAT 293

: I 1111111111:1 III II I I I II I I I : I : I I I I I I II I I I I I I II II II I I 

Db 238 AATVIVYVNGGWSSWAEWSPCSNRCGRGWQKRTRTCTNPAPLNGGAFCEGQAFQKTACTT 297

Qy 294 LCPVDGSWSPWSKWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCV 350 

: I I I I I : I : I I I I I I I : I I I I I I I I I hill : I II I I : : I I I III 

Db 298 VCPVDGAWTEWSKWSACSTECAHWRSRECMAPPPQNGGRDCSGTLLDSKNCTDGLCVLNQ 357

Qy 351 HSASGPEDVALYVGL-IAVAVCLVLLLLVLILVYCRKKEGLDSDVADSS-IL 400

I I I I I I | | : | | | : : | : : : I I I I : I : I I I I 

Db 358 RTLNDPKSHPLETSGDVALYAGLVVAVFVVVAVLMAEGVIVYRRNCRDFDTDITDSSAAL 417

Qy 401 TSGFQPVSIKPSKADNPHLL--TIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSP 458

I I I M : I : : : I I I I : III:: I : I : II : | : I I III 

Db 418 TGGFHPVNFKTARPNNPQLLHPSAPPDLTASAGIYRGPVTALQDS-ADKIPMTNSPLLDP 476

Qy 459 LGGGRHTLHHSS PTSEAEEFVSRLSTQNYFRS 490 

I : —M I : :| II : : II 

Db 477 LPSLKIKVYNSSTIGSGSGLADGADLLGVLPPGTYPGDF-SRDTHFLHLRSASLGSQHLL 535 



Qy 491 -LPRGTSNMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHKPEDVRLPLA 549



Ill I : Ml I I I I I : I I I : I I I : I I I I : I I I : : I I : : I I III: 

Db 536 GLPRDPSSSVSGTFGCLGGRLSLPGTGVSLLVPNGAIPQGKFYDLYLHINKAEST-LPLS 594 

Qy 550 -GCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSWEDVLHLGE 608 

I I I : I I I I : I I I I : I I I I I : I : I I I I : I I I : : I I I : I : I I 

Db 595 EGSQTVLSPSVTCGPTGLLLCRPVVLTVPHCAEVIAGDWIFQLKTQAHQGHWEEVVTLDE 654

Qy 609 EAPSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVACTSLEYNIRV 668 

I : I I I I I I : I : : : I I I : : I I : I : I I I I : I : I I I I I I I I I :: I I 
Db 655 ETLNTPCYCQLEAKSCHILLDQLGSYVFMGESYSRSAVKRLQLAIFAPALCTSLEYSLRV 714 

Qy 669 YCLHDTHDALKEVVQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLWKSKLLVSY 728

Ml M I I I I I : : I I : III I : : I I : I I I I I I I I I I I I : I I : I : I : I I I I I 
Db 715 YCLEDTPVALKEVLELERTLGGYLVEEPKPLLFKDSYHNLRLSLHDIPHAHWRSKLLAKY 774 

Qy 729 QEIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFSINFNITKDTRF 788 

I I I I I I I : I I I : I I I I I I I I I I I I : : : : I I : I Nihil I : : : : I 
Db 775 QEIPFYHVWNGSQRALHCTFTLERHSLASTEFTCKVCVRQVEGEGQIFQLHTTIA-ETPA 833 

Qy 789 AELLALESEAG--VPALVGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLAQKLHLDSHL 846

I I I I I : : I I I I I I I I I I II I I I I I 1111111111:1:1 
Db 834 GSLDALCSAPGNAITTQLGPYAFKIPLSIRQKICSSLDAPDSRGNDWRLLAQKLSMDRYL 893

Qy 847 SFFASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAEC 898

: : I I : I I I I : I I : I I I I I : I : I : I I : I : : I : : : : : : I 
Db 894 NYFATKASPTGVILDLWEARQQDDGDLNSLASALEEMGKSEMLVAMATDGDC 945



RESULT 13 
Q8IZJ1 

ID Q8IZJ1 PRELIMINARY; PRT; 934 AA. 

AC Q8IZJ1; 

DT 01-MAR-2003 (TrEMBLrel. 23, Created) 

DT 01-MAR-2003 (TrEMBLrel. 23, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE Transmembrane receptor UNC5H2.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=22246081; PubMed=12359238;

RA Komatsuzaki K., Dalvin S., Kinane T.B.;

RT "Modulation of G(ialpha(2)) signaling by the axonal guidance molecule 

RT UNC5H2 

RL Biochem. Biophys. Res. Commun. 297:898-905(2002).

DR EMBL; AY126437; AAM95701.1; 

DR GO; GO:0004872; F:receptor activity; IEA.

DR GO; GO:0007165; P:signal transduction; IEA.

DR InterPro; IPR000488; Death. 

DR InterPro; IPR003599; Ig. 

DR InterPro; IPR007110; Ig-like. 

DR InterPro; IPR003598; Ig_c2. 

DR InterPro; IPR000884; TSP1. 

DR InterPro; IPR008085; TSP_1. 

DR InterPro; IPR000906; ZU5. 



DR Pfam; PF00531; death; 1. 

DR Pfam; PF00047; ig; 1. 

DR Pfam; PF00090; tsp_1; 2.

DR Pfam; PF00791; ZU5; 1. 

DR PRINTS; PR01705; TSP1REPEAT. 

DR SMART; SM00005; DEATH; 1. 

DR SMART; SM00409; IG; 2. 

DR SMART; SM00408; IGc2; 1. 

DR SMART; SM00209; TSP1; 2. 

DR SMART; SM00218; ZU5; 1. 

DR PROSITE; PS50835; IG_LIKE; 1. 

DR PROSITE; PS50092; TSP1; 2. 

KW Receptor. 

SQ SEQUENCE 934 AA; 102433 MW; 225B3F506D52B780 CRC64;

Query Match 53.6%; Score 2566; DB 4; Length 934; 

Best Local Similarity 53.1%; Pred. No. 2.5e-227; 

Matches 498; Conservative 147; Mismatches 250; Indels 42; Gaps 13; 

Qy 1 MAVRPGLWPALLGIVLAAW LRGSGAQQ-SATVANPVPGANPDLLPHFLVEPEDVYIV 56

I II M I : I I I : I I : : II : I I : I I I I : I I I I 

Db 1 MGARSGARGALLLALLLCWDPRLSQAGTDSGSEVLPDSFPSAPAEPLPYFLQEPQDAYIV 60 

Qy 57 KNKPVLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKV 116 

I I I I I I hi I I I I I : II I I I I I I I III : I : : I I I I : I I I I I I I I : : 
Db 61 KNKPVELRCRAFPATQIYFKCNGEWVSQNDHVTQEGLDEATGLRVREVQIEVSRQQVEEL 120 

Qy 117 FGLEEYWCQCVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGI 176 

I I I I : I I I I II I I I I : I I I I I :: II : I I I I I I I I : I I I I III I : : : I I I I I I I : 
Db 121 FGLEDYWCQCVAWSSAGTTKSRRAYVRIAYLRKNFDQEPLGKEVPLDHEVLLQCRPPEGV 180 

Qy 177 PPAEVEWLRNEDLVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSASAA 236

I I I I I I I : I I I : : I I : I I : I : I : I :: I I I I I : I I I I I I I I I I I I I I : I I I : I 
Db 181 PVAEVEWLKNEDVIDPTQDTNFLLTIDHNLIIRQARLSDTANYTCVAKNIVAKRRSTTAT 240

Qy 237 VIVYVNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCP 296 

I I I I I I I I I I : I III II I I I I I I I I : I : I I I I I I I I I I I I I I I I I I I I I I : I I 
Db 241 VIVYVNGGWSSWAEWSPCSNRCGRGWQKRTRTCTNPAPLNGGAFCEGQAFQKTACTTICP 300

Qy 297 VDGSWSPWSKWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHSASGP 356 

111:1:1111111 : I I I I I I I I I I : I I I : I I I I I : : I I I I I : 
Db 301 VDGAWTEWSKWSACSTECAHWRSRECMAPPPQNGGRDCSGTLLDSKNCTDGLCMQMLEAS 360 

Qy 357 EDVALYVGL-IAVAVCLVLLLLVLILVYCRKKEGLDSDVADSS-ILTSGFQPVSIKPSKA 414

I I I I I I : I : I : : I : I : : I I I I : I : I I I I I I I I I : I :: 

Db 361 GDAALYAGLVVAIFVVVAILMAVGVVVYRRNCRDFDTDITDSSAALTGGFHPVNFKTARP 420

Qy 415 DNPHLL--TIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLGGGRHTLHHSSPT 472

I I II : : I I I : : I : I : II : I : | | MM : : : | | I 
Db 421 SNPQLLHPSVPPDLTASAGIYRGPVYALQDS-TDKIPMTNSPLLDPLPSLKVKVYSSSTT 479 

Qy 473 SEAEEFVSRLSTQNY FRS LPRGTSNMTYGTFN 504 

: : : : I I | | III : Ell 

Db 480 GSGPGLADGADLLGVLPPGTYPSDFARDTHFLHLRSASLGSQQLLGLPRDPGSSVSGTFG 539 



Qy 505 FLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHKPEDVRLPLA-GCQTLLSPIVSCGP 563
I I I I I I I I I : I I I : I I I M I I I M I I : : I I I I I : I I I : II I I : I II 



Db 540 CLGGRLSIPGTGVSLLVPNGAIPQGKFYEMYLLINKAEST-LPLSEGTQTVLSPSVTCGP 598 

Qy 564 PGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSWEDVLHLGEEAPSHLYYCQLEASA 623

I : I I I I I I I I I I I I I : I I I : : I I I : I : I I I : I I I I I I 
Db 599 TGLLLCRPVILTMPHCAEVSARDWIFQLKTQAHQGHWEEVVTLDEETLNTPCYCQLEPRA 658

Qy 624 CYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVACTSLEYNIRVYCLHDTHDALKEVVQ 683

Is: : I I I : I I : I : I I I I : I : I I I I I I I I I : : I I I I I I I Mill:: 
Db 659 CHILLDQLGTYVFTGESYSRSAVKRLQLAVFAPALCTSLEYSLRVYCLEDTPVALKEVLE 718 

Qy 684 LEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLWKSKLLVSYQEIPFYHIWNGTQRY 743

II: I I I I : : I I : I I I I I I I I I I I I : I I : I : I : I I I I I I I I I I I I I I : I : I : 
Db 719 LERTLGGYLVEEPKPLMFKDSYHNLRLSLHDLPHAHWRSKLIAKYQEIPFYHIWSGSQKA 778 

Qy 744 LHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFSINFNITKDTRFAELLALESEAG--VP 801

MINIMI | :::M I I : I I I I I : I I I : : : M I II | | 

Db 779 LHCTFTLERHSLASTELTCKICVRQVEGEGQIFQLHTTLA-ETPAGSLDTLCSAPGSTVT 837 

Qy 802 ALVGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLAQKLHLDSHLSFFASKPSPTAMILN 861

: I I I I I I I Mill : I I I I II Ml Mill : I : I : : M : I III MM 
Db 838 TQLGPYAFKIPLSIRQKICNSLDAPNSRGNDWRMLAQKLSMDRYLNYFATKASPTGVILD 897 

Qy 862 LWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAEC 898 

I I I I : I : I : II : I : : I : : : : : : | 

Db 898 LWEALQQDDGDLNSLASALEEMGKSEMLVAVATDGDC 934 



RESULT 14 
Q86SN3 

ID Q86SN3 PRELIMINARY; PRT; 945 AA. 

AC Q86SN3; 

DT 01-JUN-2003 (TrEMBLrel. 24, Created) 

DT 01-JUN-2003 (TrEMBLrel. 24, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE P53-regulated receptor for death and life. 

GN P53RDL1. 

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=22533857; PubMed=12598906;

RA Tanikawa C., Matsuda K., Fukuda S., Nakamura Y., Arakawa H.;

RT "p53RDL1 regulates of p53-dependent apoptosis.";

RL Nat. Cell Biol. 5:216-223(2003). 

DR EMBL; AB096256; BAC57998.1; -. 

DR GO; GO:0004872; F:receptor activity; IEA.

DR GO; GO:0007165; P:signal transduction; IEA.

DR InterPro; IPR000488; Death. 

DR InterPro; IPR003599; Ig. 

DR InterPro; IPR007110; Ig-like. 

DR InterPro; IPR003598; Ig_c2.

DR InterPro; IPR000884; TSP1.

DR InterPro; IPR008085; TSP_1.

DR InterPro; IPR000906; ZU5.

DR Pfam; PF00531; death; 1. 



DR Pfam; PF00047; ig; 1. 

DR Pfam; PF00090; tsp_1; 2.

DR Pfam; PF00791; ZU5; 1. 

DR PRINTS; PR01705; TSP1REPEAT. 

DR SMART; SM00005; DEATH; 1. 

DR SMART; SM00409; IG; 2. 

DR SMART; SM00408; IGc2; 1. 

DR SMART; SM00209; TSP1; 2. 

DR SMART; SM00218; ZU5; 1. 

DR PROSITE; PS50835; IG_LIKE; 1. 

DR PROSITE; PS50092; TSP1; 2. 

KW Receptor. 

SQ SEQUENCE 945 AA; 103637 MW; 56064E335F323447 CRC64;

Query Match 53.4%; Score 2558.5; DB 4; Length 945; 

Best Local Similarity 52.7%; Pred. No. 1.3e-226; 

Matches 501; Conservative 148; Mismatches 244; Indels 57; Gaps 15; 

Qy 1 MAVRPGLWPALLGIVLAAW LRGSGAQQ-SATVANPVPGANPDLLPHFLVEPEDVYIV 56 

I II III : I I I : I I : : II : I I : I I I I : I I I I 

Db 1 MGARSGARGALLLALLLCWDPRLSQAGTDSGSEVLPDSFPSAPAEPLPYFLQEPQDAYIV 60 

Qy 57 KNKPVLLVCKAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKV 116

I I I I I I I : I I I I I I : I I I I I I I I I III : I : : I I I I : I I I I I I I I : : 
Db 61 KNKPVELRCRAFPATQIYFKCNGEWVSQNDHVTQEGLDEATGLRVREVQIEVSRQQVEEL 120 

Qy 117 FGLEEYWCQCVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGI 176

I I I I : I I I I I I I I I I : I I I I I :: I I : I I I I I I I I : I I I I III I : : : I I I I I I I : 
Db 121 FGLEDYWCQCVAWSSAGTTKSRRAYVRIAYLRKNFDQEPLGKEVPLDHEVLLQCRPPEGV 180 

Qy 177 PPAEVEWLRNEDLVDPSLDPNWITREHSLWRQARIADTT^JYTCVAKNIVARRRSASAA 236 

I I I I I I I : I I I : : I I : I I : I : I : I : : I I I I I : I I I I I I I I I I I I I I : I I I : I 
Db 181 PVAEVEWLKNEDVIDPTQDTNFLLTIDHNLIIRQARLSDTANYTCVAKNIVAKRRSTTAT 240 

Qy 237 VIVTVNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCP 296 

I I I I I I I I I I : I III II I I I I I I I I : I : I I I I I I I I I I I I I I I I II I I I I : I I 

Db 241 VIVYWGGWSSWAEWSPCSNRCGRGWQKRTRTCTNPAPLNGGAFCEGQAFQKTACTTICP 300 

Qy 297 VDGSWSPWSKWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHS 352 

lll:|: I Mill I :| 1 I I I I J I I I : I I I : I I I I I :: I I I II: : 
Db 301 VDGAWTEWS KWS ACSTECAHWRS RECMAP P PQNGGRDC S GTLLDS KNCTDGLCMQNKKTL 360 

Qy 353 ASGPEDVALYVGL- 1 AVAVCLVLLLLVLI LVYCRKKEGLDS DVADS S- 1 LT 401 

III I I I I I I : I : l 4 : : I : I : : I I I I : I s I I I II 

Db 361 SDPNSHLLEASG — DAAL YAGL WAI FVWAI LMAVGVWYRRN CRDFDTDITDS S AALT 418 

Qy 402 SGFQPVSIKPSKADNPHLL--TIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPL 459

I I I I : I :: I I I I :: I I I : : I : I : I I : I : I I MM 
Db 419 GGFHPVNFKTARPSNPQLLHPSVPPDLTASAGIYRGPVYALQDS-TDKIPMTNSPLLDPL 477 

Qy 460 GGGRHTLHHSSPT SEAEEFVS RLSTQNY FRS L 491 

: :: I I I :: : : | | | | | 

Db 478 PSLKVKVYSSSTTGSGPGLADGADLLGVLPPGTYPSDFARDTHFLHLRSASLGSQQLLGL 537 

Qy 492 PRGTSNMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHKPEDVRLPLA-G 550

II : Ml I I I I I I I I I : I I I : I I I I : I I I I : I I : : I I III: 

Db 538 PRDPGSSVSGTFGCLGGRLSIPGTGVSLLVPNGAIPQGKFYEMYLLINKAEST-LPLSEG 596 



Qy 551 CQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSWEDVLHLGEEA 610

I I : I I I I : I I I I : I I I I I I I I I I I I I : I I I : : I I I : I : I I I 
Db 597 TQTVLS PSVTCGPTGLLLCRPVI LTMPHCAEVSARDWI FQLKTQAHQGHWEEWTLDEET 656 

Qy 611 PSHLYYCQLEASACYVFTEQLGRFALVGEALSVATyVKRLKLLLFAPVACTSLEYNIRVYC 670 

: I I I I I ll:= : I I I : I I : I : I I I I : I : I I I I I I I I I :: I I I I 
Db 657 LNTPCYCQLEPRACHILLDQLGTWFTGESYSRSAVKRLQLAVFAPALCTSLEYSLRVYC 716 

Qy 671 LHDTHDALKEWQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLWKSKLLVSYQE 730

I II I I I I I : : I I : III I : : I I : I I I I I I I I I I I I : I I : I : I : I I I I III 
Db 717 LEDTPVALKEVLELERTLGGYLVEEPKPLMFKDSYHNLRLSLHDLPHAHWRSKLLAKYQE 776

Qy 731 IPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFSINFNITKDTRFAE 790 

I I I I I I I : I : I : I I I I I I I I I I : : : : I II : I I I I I : I I I : : : : I 
Db 777 IPFYHIWSGSQKALHCTFTLERHSLASTELTCKICVRQVEGEGQIFQLHTTLA-ETPAGS 835 

Qy 791 LLALESEAG--VPALVGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLAQKLHLDSHLSF 848

I II I I : I I I I I I I 11111:1111 I I I M I | M | : | : | : : 
Db 836 LDTLCSAPGSTVTTQLGPYAFKIPLSIRQKICNSLDAPNSRGNDWRMLAQKLSMDRYLNY 895 

Qy 849 FASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAEC 898 

I I : I I I I : I I : I I I I : I : I : I I : I : : I : : : : : : | 

Db 896 FATKASPTGVILDLWEALQQDDGDLNSLASALEEMGKSEMLVAVATDGDC 945
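
The summary figures printed above each alignment appear to follow from simple arithmetic. The sketch below is an inference from the printed numbers, not a description of how the search program works internally. It assumes that Query Match is the hit score divided by the perfect score of 4791 given in the report header, and that Best Local Similarity is Matches divided by the sum of Matches, Conservative, Mismatches and Indels. The function names are hypothetical.

```python
def query_match_pct(score, perfect_score=4791.0):
    """Score as a percentage of the perfect (self-match) score.

    The default of 4791 is the "Perfect score" from the report header.
    """
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity_pct(matches, conservative, mismatches, indels):
    """Identical positions as a percentage of all aligned columns.

    Assumption: the number of aligned columns is the sum of the four counts.
    """
    aligned_columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / aligned_columns, 1)


# Values taken from RESULT 14 (Q86SN3) in this report.
print(query_match_pct(2558.5))                       # 53.4
print(best_local_similarity_pct(501, 148, 244, 57))  # 52.7
```

Under these assumptions the same two functions also reproduce the percentages printed for the other hits in this section, for example 45.9% and 45.5% for RESULT 15.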



RESULT 15 
Q8K1S2 

ID Q8K1S2 PRELIMINARY; PRT; 956 AA. 

AC Q8K1S2; 

DT 01-OCT-2002 (TrEMBLrel. 22, Created) 

DT 01-OCT-2002 (TrEMBLrel. 22, Last sequence update) 

DT 01-OCT-2003 (TrEMBLrel. 25, Last annotation update) 

DE Netrin receptor Unc5h4.

GN UNC5H4.

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RA Engelkamp D.;

RT "Cloning of three mouse unc-5 genes and their expression patterns at

RT mid-gestation.";

RL Submitted (MAY-2002) to the EMBL/GenBank/DDBJ databases.

DR EMBL; AJ487854; CAD32252.1; -.

DR MGD; MGI:2389364; Unc5h4.

DR GO; GO:0004872; F:receptor activity; IEA.

DR GO; GO:0004867; F:serine protease inhibitor activity; IEA.

DR GO; GO:0007165; P:signal transduction; IEA.

DR InterPro; IPR000488; Death. 

DR InterPro; IPR003599; Ig. 

DR InterPro; IPR007110; Ig-like. 

DR InterPro; IPR003598; Ig_c2. 

DR InterPro; IPR000215; Serpin. 

DR InterPro; IPR000884; TSP1. 

DR InterPro; IPR008085; TSP_1.

DR InterPro; IPR000906; ZU5.

DR Pfam; PF00531; death; 1.

DR Pfam; PF00047; ig; 1.

DR Pfam; PF00090; tsp_1; 2.

DR Pfam; PF00791; ZU5; 1.

DR PRINTS; PR01705; TSP1REPEAT. 

DR SMART; SM00005; DEATH; 1. 

DR SMART; SM00409; IG; 1. 

DR SMART; SM00408; IGc2; 1. 

DR SMART; SM00209; TSP1; 2. 

DR SMART; SM00218; ZU5; 1.

DR PROSITE; PS50835; IG_LIKE; 1. 

DR PROSITE; PS00284; SERPIN; 1. 

DR PROSITE; PS50092; TSP1; 2. 

KW Immunoglobulin domain; Receptor. 

SQ SEQUENCE 956 AA; 106351 MW; DFDF07839C10C68D CRC64; 

Query Match 45.9%; Score 2200; DB 11; Length 956; 

Best Local Similarity 45.5%; Pred. No. 1.6e-193; 

Matches 431; Conservative 159; Mismatches 280; Indels 78; Gaps 15; 

Qy 8 WPALLGIVLAAWLRGS GAQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKPVLL 63 

I M: I I : II I : : : I I i I I I I : I I I I I I : I : I : I 

Db 15 WLPWLGLFF--WAAGAAAARGADGSEILPDSIPSA-PGTLPHFIEEPEDAYIIKSNPIAL 71

Qy 64 VCKAVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLEEYW 123 

Ml M I I I I I I I I I I I I : I I I I I I I I I II Mhlllll I 1:11 
Db 72 RCKARPAMQIFFKCNGEWVHQNEHVSEESLDESSGLKVREVFINVTRQQVEDFHGPEDYW 131 

Qy 124 CQCVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAEVEW 183 

I I I I I I I I I : I I : I I : I I I I I I I I I I : I : I I : I III I I I I I I : I Mill 
Db 132 CQCVAWSHLGTSKSRKASVRIAYLRKNFEQDPQGREVPIEGMIVLHCRPPEGVPAAEVEW 191

Qy 184 LRNEDLVDPSLDPNVTITREHSLWRQARLADTANYTCVAKNIVARRRSASAAVIVTWG 243 

I : I I : : I I I : : I : I : : I I I I I : I : I I I I : I I I I I : I I I II I : I I I I I 
Db 192 LKNEEPIDSEQDENIDTR7VDHNLIIRQARLSDSGNYTCMAANIVAKRRSLSATVWYVNG 251 

Qy 244 GWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATLCPVDGSWSP 303 

111:11111 I : I I I I I I I I I I : I I I I I I I I I I I I I I I : I I I I I II I I I I I 
Db 252 GWSSWTEWSACNVRCGRGWQKRSRTCTNPAPLNGGAFCEGMSVQKITCTALCPVDGSWEV 311 

Qy 304 WSKWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCV 350 

I I : I I I : I I I I I I : I I I I I I : I : I : : I I I I I : ' 
Db 312 WSEWSVCSPECEHLRIRECTAPPPRNGGKFCEGLSQESENCTDGLCILDKKPLHEIKPQR 371 

Qy 351 HSASGPEDVALYVGLIAVAVCLVLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPV 407 

I : I I I I I II I : I : : : : I I II I I I I I I I I 

Db 372 WSRRGIENASDIALYSGL-GAAWAVAVLVIGVTLYRRSHSDYGVDVIDSSALTGGFQTF 430 

Qy 408 SIKPSKADNPHLL--TIQPDLSTTTTTYQGSLCPRQDGPSPKFQLTNGHLLSPLG 460

: I : I M : I I II I : I I I : I III I : I I : I I 
Db 431 NFKTVRQGNSLLLNPAMQPDL-TVSRTYSGPIC-LQD-PLDKELMTESSLFNPLSDIKVK 487 

Qy 461 GGRH TLHHSSPTSEAEEFVSRLSTQNYFR 489 

II I : I : I : : I I 
Db 488 VQSSFMVSLGVSERAEYHGKNHSGTFPHGNNRGFSTIHPRNKT PYIQNLS 537 



Qy 490 SLPRGTSNMTYGTFNFLGGRLMIPNTGISLLIPPDAIPRGKIYEIYLTLHKPEDVRLPLA 549 

Ml I I II I I I II : : I I I I : I I I I I III : I I I : : : : : I I 

Db 538 SLPTRTELRTTGVFGHLGGRLVMPNTGVSLLIPHGAIPEENSWEIYMSINQGEP-SLQSD 596 

Qy 550 GCQTLLSPIVSCGPPGVLLTRPVILAMDHCGEPSPDSWSLRLKKQSCEGSWEDVLHLGEE 609 

I : I I I I I : I I I I : I : I I I : I I : I : I : : I I I : : : I I I : I : : : I 
Db 597 GSEVLLSPEVTCGPPDMLVTTPFALTIPHCADVSSEHWNIHLKKRTQQGKWEEVMSVEDE 656 

Qy 610 APSHLYYCQLEASACYVFTEQLGRFALVGEALSVAAAKRLKLLLFAPVACTSLEYNIRVY 669 

: I Ml: I I : I : I : I I I I : : I I : I I : : I : : I I I : I I : I I I 
Db 657 STS--CYCLLDPFACHVLLDSFGTYALTGEPITDCAVKQLKVAVFGCMSCNSLDYNLRVY 714

Qy 670 CLHDTHDALKEWQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLWKSKLLVSYQ 729

I : : I 1:11: I : I I I I : : I I : : I I I I : : I : : I : I : I I I : I : | 
Db 715 CVDNTPCAFQEVISDERHQGGQLLEEPKLLHFKGNTFSLQVSVLDIPPFLWRIKPFTACQ 774 

Qy 730 EIPFYHIWNGTQRYLHCTFTLERVSPSTSDLACKLWVWQVEGDGQSFSINFNITKDTRFA 789 



iii • i • • • i i i i • i i i • i • I • I • i i • • i • - i [ • -i - i 

Db 775 EVPFSRVWSSNRQPLHCAFSLERYTPTTTQLSCKICIRQLKGHEQILQVQTSILESERET 834 

Qy 790 ELLALESEAGVPALVGPSAFKIPFLIRQKIISSLDPPCRRGADWRTLAQKLHLDSHLSFF 849

: : : II I I II I I I : I I I : I : : I I : I I I : I I I I : : : I I : I 
Db 835 ITFFAQEDSTFPAQTGPKAFKIPYSIRQRICATFDTPNAKGKDWQMLAQKNSINRNLSYF 894 

Qy 850 ASKPSPTAMILNLWEARHFPNGNLSQLAAAVAGLGQPDAGLFTVSEAE 897

I : : I I : I : I I I I I II I I : I : I III: : I : I : : I : 

Db 895 ATQSSPSAVILNLWEARHQQDGDLDSLACALEEIGRTHTKLSNITEPQ 942 



Search completed: July 12, 2004, 23:00:43 
Job time : 96 secs

ALIGNMENTS



RESULT 1 
BAI1_HUMAN

ID BAI1_HUMAN STANDARD; PRT; 1584 AA.

AC O14514;

DT 16-OCT-2001 (Rel. 40, Created)

DT 16-OCT-2001 (Rel. 40, Last sequence update)

DT 28-FEB-2003 (Rel. 41, Last annotation update)

DE Brain-specific angiogenesis inhibitor 1 precursor.

GN BAI1.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Fetal brain; 

RX MEDLINE=98054121; PubMed=9393972;

RA Nishimori H., Shiratsuchi T., Urano T., Kimura Y., Kiyono K.,

RA Tatsumi K., Yoshida S., Ono M., Kuwano M., Nakamura Y., Tokino T.;

RT "A novel brain-specific p53-target gene, BAI1, containing

RT thrombospondin type 1 repeats inhibits experimental angiogenesis.";

RL Oncogene 15:2145-2150(1997).

RN [2]

RP INTERACTION WITH BAP1.

RX MEDLINE=98321173; PubMed=9647739;

RA Shiratsuchi T., Futamura M., Oda K., Nishimori H., Nakamura Y.,

RA Tokino T.;

RT "Cloning and characterization of BAI-associated protein 1: a PDZ

RT domain-containing protein that interacts with BAI1.";

RL Biochem. Biophys. Res. Commun. 247:597-604(1998).

CC -!- FUNCTION: LIKELY TO BE A POTENT INHIBITOR OF ANGIOGENESIS IN 
CC BRAIN AND MAY PLAY A SIGNIFICANT ROLE AS A MEDIATOR OF THE P53 

CC SIGNAL IN SUPPRESSION OF GLIOBLASTOMA. MAY FUNCTION IN CELL 

CC ADHESION AND SIGNAL TRANSDUCTION IN THE BRAIN. 

CC -!- SUBUNIT: INTERACTS WITH BAP1. 

CC -!- SUBCELLULAR LOCATION: INTEGRAL MEMBRANE PROTEIN. LIKELY TO BE 
CC CONCENTRATED AT CELL-CELL ADHESION SITES. 

CC -!- TISSUE SPECIFICITY: SPECIFICALLY EXPRESSED IN BRAIN. REDUCED OR NO
CC EXPRESSION IS OBSERVED IN SOME GLIOBLASTOMA CELL LINES AND CANCER 

CC TISSUES. 

CC -!- INDUCTION: By p53. 

CC -!- DOMAIN: THE TSP1 REPEATS INHIBIT IN VIVO ANGIOGENESIS IN RAT 
CC CORNEA INDUCED BY BFGF. 

CC -!- SIMILARITY: Belongs to family 2 of G-protein coupled receptors. 

CC -!- SIMILARITY: Contains 5 TSP type-1 domains. 

CC -!- SIMILARITY: Contains 1 GPS domain. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; AB005297; BAA23647.1; 

DR PIR; T00026; T00026. 

DR Genew; HGNC:943; BAI1.

DR MIM; 602682; -.

DR GO; GO:0005887; C:integral to plasma membrane; TAS.

DR GO; GO:0005911; C:intercellular junction; TAS.

DR GO; GO:0005515; F:protein binding; TAS.

DR GO; GO:0007409; P:axonogenesis; TAS.

DR GO; GO:0007155; P:cell adhesion; TAS.

DR GO; GO:0008285; P:negative regulation of cell proliferation; TAS.

DR GO; GO:0007422; P:peripheral nervous system development; TAS.

DR GO; GO:0007165; P:signal transduction; TAS.

DR InterPro; IPR000832; GPCR_secretin.

DR InterPro; IPR001879; hormn_receptor.

DR InterPro; IPR000203; PKD_cys_rich.

DR InterPro; IPR000884; TSP1.

DR Pfam; PF00002; 7tm_2; 1.

DR Pfam; PF01825; GPS; 1.

DR Pfam; PF02793; HRM; 1.

DR Pfam; PF00090; tsp_1; 5.

DR SMART; SM00303; GPS; 1.

DR SMART; SM00008; HormR; 1.

DR SMART; SM00209; TSP1; 5.

DR PROSITE; PS50221; GPS; 1.

DR PROSITE; PS00649; G_PROTEIN_RECEP_F2_1; FALSE_NEG.

DR PROSITE; PS00650; G_PROTEIN_RECEP_F2_2; FALSE_NEG.



DR PROSITE; PS50227; G_PROTEIN_RECEP_F2_3; 1.
DR PROSITE; PS50261; G_PROTEIN_RECEP_F2_4; 1.
DR PROSITE; PS50092; TSP1; 5.
KW G-protein coupled receptor; Transmembrane; Glycoprotein; Signal;
KW Repeat; Cell adhesion.
FT SIGNAL POTENTIAL.
FT 31 1584 BRAIN-SPECIFIC ANGIOGENESIS INHIBITOR 1.
FT DOMAIN 31 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 949 969 1 (POTENTIAL).
FT DOMAIN 970 980 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 981 1001 2 (POTENTIAL).
FT DOMAIN 1002 1008 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 1009 1029 3 (POTENTIAL).
FT DOMAIN 1030 1052 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 1053 1073 4 (POTENTIAL).
FT DOMAIN 1074 1093 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 1094 1114 5 (POTENTIAL).
FT DOMAIN 1115 1136 CYTOPLASMIC (POTENTIAL).
FT TRANSMEM 1137 1157 6 (POTENTIAL).
FT DOMAIN 1158 1166 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 1167 1187 7 (POTENTIAL).
FT DOMAIN 1188 1584 CYTOPLASMIC (POTENTIAL).
FT DOMAIN 261 315 TSP TYPE-1 1.
FT DOMAIN 354 407 TSP TYPE-1 2.
FT DOMAIN 409 462 TSP TYPE-1 3.
FT DOMAIN 467 520 TSP TYPE-1 4.
FT DOMAIN 522 575 TSP TYPE-1 5.
FT DOMAIN 881 938 GPS.
FT DOMAIN 1411 1422 POLY-PRO.
FT DOMAIN 1425 1430 POLY-PRO.
FT SITE 231 233 CELL ATTACHMENT SITE (POTENTIAL).
FT DOMAIN 1365 1584 NECESSARY FOR INTERACTION WITH BAP1.
FT DOMAIN 1581 1584 INDISPENSABLE FOR INTERACTION WITH BAP1.
FT CARBOHYD 64 64 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 401 401 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 607 607 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 692 692 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 844 844 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 877 877 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 881 881 N-LINKED (GLCNAC...) (POTENTIAL).
SQ SEQUENCE 1584 AA; 173531 MW; DEA8F28C77874513 CRC64;

Query Match 6.2%; Score 298.5; DB 1; Length 1584;



Best Local Similarity 33.5%; Pred. No. 2.7e-14; 

Matches 78; Conservative 35; Mismatches 91; Indels 29; Gaps 11 

Qy 124 CQCVAWSSSGTTKSQKAYIRIARLRKNFEQEPLAKEVSLEQGIVLPCRPPEGIPPAEVEW 183

I I : I I I : : I I : : I I I : I I I I I I I I I 

Db 309 CNREACGPAGRTSSRSQSLRSTDARR REELGDEL QQFGFPA-PQTGDPAAE-EW 360 

Qy 184 LRNEDLVDPSLDPNVYITREHSLVVRQARLADTANYTCVAKNIVARRRSASAAVIVYVNG 243

: II : I I I : : : I : : : I : : : I : | 

Db 361 — SPWSVCSSTCGEGWQTR TRFCVSSSYSTQCSGPLREQRLCNNSAVCPVHG 410 

Qy 244 GW S TWT EW S VC S AS C G RGWQ KRSRSCTN PAP LN G GAFC E GQN VQ KT AC - AT L C P VDG 2 99 

I I : I I : I I : : I I I I : : I : I : I I II III I I III I I | 

Db 411 AWDEWSPWSLCSSTCGRGFRDRTRTCR— PPQFGGNPCEGPEKQTKFCNIALCPGRAVDG 4 68 



Qy 300 SWSPWSKWSACGLDCT HWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLC 34 9 

: I : I I I I I I I : I : I I I : I : I I I I I I : : I I : I I 

Db 469 NWNEWSSWSACSASCSQGRQQRTRECNGPS--YGGAECQGHWVETRDCFLQQC 519 

RESULT 2 
TSP2_HUMAN 

ID TSP2_HUMAN STANDARD; PRT; 1172 AA. 

AC P35442; 

DT 01-JUN-1994 (Rel. 29, Created) 

DT 01-JUN-1994 (Rel. 29, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Thrombospondin 2 precursor. 

GN THBS2 OR TSP2. 

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=94010892; PubMed=8406456;

RA Labell T.L., Byers P.H.;

RT "Sequence and characterization of the complete human thrombospondin 2

RT cDNA: potential regulatory role for the 3' untranslated region.";

RL Genomics 17:225-229(1993).

RN [2]

RP SEQUENCE OF 560-1172 FROM N.A.

RC TISSUE=Fibroblast;

RX MEDLINE=92217961; PubMed=1559694;

RA Labell T.L., McGookey Milewicz D.J., Disteche C.M., Byers P.H.;

RT "Thrombospondin II: partial cDNA sequence, chromosome location, and

RT expression of a second member of the thrombospondin gene family in

RT humans.";

RL Genomics 12:421-429(1992).

RN [3]

RP THROMBOSPONDIN REPEATS DISULFIDE BONDS.

RX MEDLINE=21588233; PubMed=11590138;

RA Misenheimer T.M., Hahr A.J., Harms A.C., Annis D.S., Mosher D.F.;

RT "Disulfide connectivity of recombinant C-terminal region of human

RT thrombospondin 2.";

RL J. Biol. Chem. 276:45882-45887(2001). 

CC -!- FUNCTION: Adhesive glycoprotein that mediates cell-to-cell and 

CC cell-to-matrix interactions. Can bind to fibrinogen, fibronectin, 

CC laminin and type V collagen. 

CC -!- SUBUNIT: Homotrimer; disulfide-linked.

CC -!- SIMILARITY: Belongs to the thrombospondin family.

CC -!- SIMILARITY: Contains 1 VWFC domain.

CC -!- SIMILARITY: Contains 3 EGF-like domains.

CC -!- SIMILARITY: Contains 3 TSP type-1 domains.

CC -!- SIMILARITY: Contains 7 TSP type-3 domains.

CC -!- SIMILARITY: Contains 1 TSP N-terminal (TSPN) domain.

CC

CC This SWISS-PROT entry is copyright. It is produced through a collaboration

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 



CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; L12350; AAA03703.1; -.
DR EMBL; M81339; -; NOT_ANNOTATED_CDS.
DR PIR; A47379; TSHUP2.
DR HSSP; P00740; 1EDM.
DR Genew; HGNC:11786; THBS2.
DR MIM; 188061; -.
DR GO; GO:0008201; F:heparin binding; TAS.
DR InterPro; IPR001881; EGF_Ca.
DR InterPro; IPR006209; EGF_like.
DR InterPro; IPR006210; IEGF.
DR InterPro; IPR000884; TSP1.
DR InterPro; IPR008085; TSP_1.
DR InterPro; IPR003367; tsp_3.
DR InterPro; IPR008859; TSPC.
DR InterPro; IPR003129; TSPN.
DR InterPro; IPR001007; VWF_C.
DR Pfam; PF00008; EGF; 2.
DR Pfam; PF00090; tsp_1; 3.
DR Pfam; PF02412; tsp_3; 13.
DR Pfam; PF05735; TSPC; 1.
DR Pfam; PF02210; TSPN; 1.
DR Pfam; PF00093; vwc; 1.
DR PRINTS; PR01705; TSP1REPEAT.
DR SMART; SM00181; EGF; 3.
DR SMART; SM00209; TSP1; 3.
DR SMART; SM00210; TSPN; 1.
DR SMART; SM00214; VWC; 1.
DR PROSITE; PS00022; EGF_1; FALSE_NEG.
DR PROSITE; PS01186; EGF_2; 1.
DR PROSITE; PS50026; EGF_3; 2.
DR PROSITE; PS50092; TSP1; 3.
DR PROSITE; PS01208; VWFC_1; 1.
DR PROSITE; PS50184; VWFC_2; 1.
KW Glycoprotein; Cell adhesion; Calcium-binding; Heparin-binding; Repeat;
KW EGF-like domain; Signal.
FT SIGNAL 1 18 POTENTIAL.
FT CHAIN 19 1172 THROMBOSPONDIN 2
FT DOMAIN 19 215 TSP N-TERMINAL.
FT DOMAIN 19 232 HEPARIN-BINDING (POTENTIAL).
FT DOMAIN 318 375 VWFC.
FT DOMAIN 381 431 TSP TYPE-1 1.
FT DOMAIN 437 492 TSP TYPE-1 2.
FT DOMAIN 494 549 TSP TYPE-1 3.
FT DOMAIN 549 589 EGF-LIKE 1.
FT DOMAIN 590 647 EGF-LIKE 2, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 648 692 EGF-LIKE 3.
FT DOMAIN 725 760 TSP TYPE-3 1.
FT DOMAIN 761 783 TSP TYPE-3 2.
FT DOMAIN 784 819 TSP TYPE-3 3.
FT DOMAIN 820 842 TSP TYPE-3 4.
FT DOMAIN 843 880 TSP TYPE-3 5.
FT DOMAIN 881 916 TSP TYPE-3 6.
FT DOMAIN 917 952 TSP TYPE-3 7.
FT DOMAIN 953 1172 C-TERMINAL.
FT SITE 928 930 CELL ATTACHMENT SITE (POTENTIAL).
FT DISULFID 266 266 INTERCHAIN (PROBABLE)
FT DISULFID 270 270 INTERCHAIN (PROBABLE)
FT DISULFID 393 425 BY SIMILARITY.
FT DISULFID 430 BY SIMILARITY.
FT DISULFID 408 415 BY SIMILARITY.
FT DISULFID 449 BY SIMILARITY.
FT DISULFID 453 SIMILARITY.
FT 464 BY SIMILARITY.
FT BY SIMILARITY.
FT DISULFID 510 548 BY SIMILARITY.
FT DISULFID 521 533 BY SIMILARITY.
FT DISULFID 553 564 BY SIMILARITY.
FT DISULFID 558 574 BY SIMILARITY.
FT DISULFID 577 588 BY SIMILARITY.
FT DISULFID 594 610 BY SIMILARITY.
FT DISULFID 601 619 BY SIMILARITY.
FT DISULFID 622 646 BY SIMILARITY.
FT DISULFID 652 665 BY SIMILARITY.
FT DISULFID 659 678 BY SIMILARITY.
FT DISULFID 680 691 BY SIMILARITY.
FT DISULFID 707 715
FT DISULFID 720 740
FT DISULFID 756 776
FT DISULFID 779 799
FT DISULFID 815 835
FT DISULFID 838 858
FT DISULFID 876 896
FT DISULFID 912 932
FT DISULFID 948 1169
FT CARBOHYD 151 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 316 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 330 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 457 457 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 584 584 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 710 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 1069 1069 N-LINKED (GLCNAC...) (POTENTIAL).
SQ SEQUENCE 1172 AA; 129955 MW; 2AC7BB230E44C6F5 CRC64;



Query Match 6.2%; Score 296.5; DB 1; Length 1172; 

Best Local Similarity 30.5%; Pred. No. 2.5e-14; 

Matches 78; Conservative 28; Mismatches 105; Indels 45; Gaps

Qy 209 RQARLADTANYTCVAKNIVARRRSASAA-VIVTWGGWSTWTEWSVCSASCGRGWQKRSR 267

: : I I : I I : : I I II : : I I I I I : I I I I : I I I I I

Db 403 QRGRSCDVTSNTCLGPSIQTRACSLSKCDTRIRQDGGWSHWSPWSSCSVTCGVGNITRIR 462

Qy 268 SCTNPAPLNGGAFCEGQNVQKTAC-ATLCPVDGSWSPWSKWSACGLDCT HWRSRECS 323

I : I I M I : I : II I I : I I I I I I I I I I I : I I : I I :

Db 463 LCNSPVPQMGGKNCKGSGRETKACQGAPCPIDGRWSPWSPWSACTVTCAGGIRERTRVCN 522

Qy 324 DPAPRNGGEECQGTDLDTRNCTSDLCVHSASGPEDVALYVGLIAVAVCLVLLLLVLILVY 383

I I : I I : I I : : I I III II

Db 523 SPEPQYGGKACVGDVQERQMCNKRSC PVDGCLSNPCFPGAQC 564

Qy 384 CRKKEGLDSDVADSSILTSGFQPVSI--KPSKADNPHLLTIQPDLSTTTT TYQ 434



I II •MM ... , 

Db 565 SSFPDGS-WSCGFCPVGFLGNGTHCEDLDECALVPDICFSTSKVPRCWTQP 615 

Qy 435 GSLC PRQDGPSP 446 

II II I I 

Db 616 GFHCLPCPPRYRGNQP 631 



RESULT 3 
SM5A_HUMAN

ID SM5A_HUMAN STANDARD; PRT; 1074 AA.

AC Q13591; O60408;

DT 30-MAY-2000 (Rel. 39, Created)

DT 30-MAY-2000 (Rel. 39, Last sequence update)

DT 28-FEB-2003 (Rel. 41, Last annotation update)

DE Semaphorin 5A precursor (Semaphorin F) (Sema F).

GN SEMA5A OR SEMAF.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=98125554; PubMed=9464278;

RA Simmons A.D., Puschel A.W., McPherson J.D., Overhauser J., Lovett M.;

RT "Molecular cloning and mapping of human semaphorin F from the Cri-du-

RT chat candidate interval.";

RL Biochem. Biophys. Res. Commun. 242:685-691(1998).

RN [2]

RP SEQUENCE OF 1-494 FROM N.A.

RA Kalicki J., Harmon G.;

RL Submitted (APR-1998) to the EMBL/GenBank/DDBJ databases.

CC -!- FUNCTION: May act as positive axonal guidance cues. 

CC -!- SUBCELLULAR LOCATION: Type I membrane protein. 

CC -!- SIMILARITY: Belongs to the semaphorin family. 

CC -!- SIMILARITY: Contains 1 Sema domain. 

CC -!- SIMILARITY: Contains 7 TSP type-1 domains. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch). 

CC 

DR EMBL; U52840; AAC09473.1; 

DR EMBL; AC004615; AAC14668.1; -. 

DR PIR; JC5928; JC5928. 

DR Genew; HGNC:10736; SEMA5A.

DR GO; GO:0007155; P:cell adhesion; TAS.

DR GO; GO:0007267; P:cell-cell signaling; TAS.

DR GO; GO:0007399; P:neurogenesis; TAS.

DR InterPro; IPR003659; Plexin-like.

DR InterPro; IPR002165; Plexin_repeat.

DR InterPro; IPR001627; Sema. 

DR InterPro; IPR000884; TSP1. 



DR InterPro; IPR008085; TSP_1.
DR Pfam; PF01437; PSI; 1.
DR Pfam; PF01403; Sema; 1.
DR Pfam; PF00090; tsp_1; 6.
DR PRINTS; PR01705; TSP1REPEAT.
DR SMART; SM00423; PSI; 1.
DR SMART; SM00630; Sema; 1.
DR SMART; SM00209; TSP1; 6.
DR PROSITE; PS50092; TSP1; 6.
KW Signal; Transmembrane; Repeat; Multigene family; Neurogenesis;
KW Developmental protein; Glycoprotein.
FT SIGNAL 1 22 POTENTIAL.
FT CHAIN 23 1074 SEMAPHORIN 5A.
FT DOMAIN 23 968 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 969 989 POTENTIAL.
FT DOMAIN 990 1074 CYTOPLASMIC (POTENTIAL).
FT DOMAIN 226 507 SEMA.
FT DOMAIN 540 593 TSP TYPE-1 1.
FT DOMAIN 595 651 TSP TYPE-1 2.
FT DOMAIN 653 702 TSP TYPE-1 3.
FT DOMAIN 707 765 TSP TYPE-1 4.
FT DOMAIN 784 839 TSP TYPE-1 5.
FT DOMAIN 841 896 TSP TYPE-1 6.
FT DOMAIN 897 944 TSP TYPE-1 7.
FT CARBOHYD 142 142 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 168 168 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 227 227 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 277 277 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 323 323 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 367 367 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 437 437 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 536 536 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 591 591 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 717 717 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 933 933 N-LINKED (GLCNAC...) (POTENTIAL).
FT CONFLICT 56 56 A -> V (IN REF. 2).
FT CONFLICT 149 149 A -> T (IN REF. 2).
FT CONFLICT 382 382 V -> M (IN REF. 2).
FT CONFLICT 494 494 S -> R (IN REF. 2).
SQ SEQUENCE 1074 AA; 120570 MW; EE3DB763CBE29407 CRC64;



Query Match 6.1%; Score 293; DB 1; Length 1074; 

Best Local Similarity 45.8%; Pred. No. 4.1e-14; 

Matches 54; Conservative 11; Mismatches 49; Indels 4; Gaps 2 

Qy 241 VNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATL-CPVDG 299 

I I I I I I I I I I I 111:111111 II II::: I I I I I I I I 
Db 783 VNGAWSAWTSWSQCSRDCSRGIRNRKRVCNNPEPKYGGMPCLGPSLEYQECNTLPCPVDG 842 

Qy 300 SWSPWSKWSACGLDC THWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHSAS 354 

I I I I I : I I : I : I I I : I I I I I : I I : I : I II 
Db 843 VWSCWSPWTKCSATCGGGHYMRTRSCSNPAPAYGGDICLGLHTEEALCNTQPCPESWS 900 
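
Each hit in this report is preceded by three summary lines (Query Match, Best Local Similarity, Matches and so on). As a minimal sketch, those fields can be pulled out with a regular expression. The function name `parse_summary` and the list of field names are assumptions made for illustration; the report itself does not define a formal grammar, and OCR damage in a given line may still defeat the pattern.

```python
import re

# Field names as they are printed in the report, each followed by a number.
SUMMARY_FIELD = re.compile(
    r"(Query Match|Best Local Similarity|Pred\. No\.|Score|DB|Length|"
    r"Matches|Conservative|Mismatches|Indels|Gaps)\s+([0-9.eE+-]+)%?"
)


def parse_summary(lines):
    """Collect the numeric summary fields found in the given lines."""
    fields = {}
    for line in lines:
        for name, value in SUMMARY_FIELD.findall(line):
            fields[name] = float(value)
    return fields


# Summary lines of RESULT 3 (SM5A_HUMAN) as printed in this report.
result_3 = [
    "Query Match 6.1%; Score 293; DB 1; Length 1074;",
    "Best Local Similarity 45.8%; Pred. No. 4.1e-14;",
    "Matches 54; Conservative 11; Mismatches 49; Indels 4; Gaps 2",
]
summary = parse_summary(result_3)
print(summary["Score"], summary["Pred. No."], summary["Gaps"])
```

Percentages are stored without the percent sign, so `summary["Query Match"]` is the number 6.1 rather than a fraction.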



RESULT 4 
TSP2_MOUSE 

ID TSP2_MOUSE STANDARD; PRT; 1172 AA.

AC Q03350;

DT 01-JUN-1994 (Rel. 29, Created)

DT 01-JUN-1994 (Rel. 29, Last sequence update)

DT 28-FEB-2003 (Rel. 41, Last annotation update)

DE Thrombospondin 2 precursor.

GN THBS2 OR TSP2.

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=92147683; PubMed=1371115;

RA Laherty C.D., O'Rourke K., Wolf F.W., Katz R., Seldin M.F.,

RA Dixit V.M.;

RT "Characterization of mouse thrombospondin 2 sequence and expression 

RT during cell growth and development."; 

RL J. Biol. Chem. 267:3274-3281(1992). 

RN [2] 

RP SEQUENCE OF 1-873 FROM N.A. 

RX MEDLINE=91302287; PubMed=1712771;

RA Bornstein P., O'Rourke K., Wikstrom K., Wolf F.W., Katz R., Li P.,

RA Dixit V.M.;

RT "A second, expressed thrombospondin gene (Thbs2) exists in the mouse

RT genome.";

RL J. Biol. Chem. 266:12821-12824(1991). 

CC -!- FUNCTION: Adhesive glycoprotein that mediates cell-to-cell and 

CC cell-to-matrix interactions. Can bind to fibrinogen, fibronectin, 

CC laminin and type V collagen. 

CC -!- SUBUNIT: Homotrimer; disulfide-linked.

CC -!- SIMILARITY: Belongs to the thrombospondin family.

CC -!- SIMILARITY: Contains 1 VWFC domain.

CC -!- SIMILARITY: Contains 3 EGF-like domains. 

CC -!- SIMILARITY: Contains 3 TSP type-1 domains. 

CC -!- SIMILARITY: Contains 7 TSP type-3 domains. 

CC -!- SIMILARITY: Contains 1 TSP N-terminal (TSPN) domain. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; L07803; AAA53064.1; -. 

DR EMBL; M64866; AAA40432.1; -. 

DR PIR; A42587; A42587. 

DR HSSP; P00740; 1EDM. 

DR MGD; MGI:98738; Thbs2.

DR InterPro; IPR001881; EGF_Ca.

DR InterPro; IPR006209; EGF_like.

DR InterPro; IPR006210; IEGF.

DR InterPro; IPR000884; TSP1.

DR InterPro; IPR008085; TSP_1.

DR InterPro; IPR003367; tsp_3.

DR InterPro; IPR008859; TSPC. 



DR InterPro; IPR003129; TSPN.
DR InterPro; IPR001007; VWF_C.
DR Pfam; PF00008; EGF; 2.
DR Pfam; PF00090; tsp_1; 3.
DR Pfam; PF02412; tsp_3; 13.
DR Pfam; PF05735; TSPC; 1.
DR Pfam; PF02210; TSPN; 1.
DR Pfam; PF00093; vwc; 1.
DR PRINTS; PR01705; TSP1REPEAT.
DR SMART; SM00181; EGF; 3.
DR SMART; SM00209; TSP1; 3.
DR SMART; SM00210; TSPN; 1.
DR SMART; SM00214; VWC; 1.
DR PROSITE; PS00022; EGF_1; FALSE_NEG.
DR PROSITE; PS01186; EGF_2; 1.
DR PROSITE; PS50026; EGF_3; 2.
DR PROSITE; PS50092; TSP1; 3.
DR PROSITE; PS01208; VWFC_1; 1.
DR PROSITE; PS50184; VWFC_2; 1.
KW Glycoprotein; Cell adhesion; Calcium-binding; Heparin-binding; Repeat;
KW EGF-like domain; Signal.






FT SIGNAL 1 18 POTENTIAL.
FT CHAIN 19 1172 THROMBOSPONDIN 2.
FT DOMAIN 19 215 TSP N-TERMINAL.
FT DOMAIN 19 232 HEPARIN-BINDING (POTENTIAL).
FT DOMAIN 318 375 VWFC.
FT DOMAIN 381 431 TSP TYPE-1 1.
FT DOMAIN 437 492 TSP TYPE-1 2.
FT DOMAIN 494 549 TSP TYPE-1 3.
FT DOMAIN 549 589 EGF-LIKE 1.
FT DOMAIN 590 647 EGF-LIKE 2, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 648 692 EGF-LIKE 3.
FT DOMAIN 725 760 TSP TYPE-3 1.
FT DOMAIN 761 783 TSP TYPE-3 2.
FT DOMAIN 784 819 TSP TYPE-3 3.
FT DOMAIN 820 842 TSP TYPE-3 4.
FT DOMAIN 843 880 TSP TYPE-3 5.
FT DOMAIN 881 916 TSP TYPE-3 6.
FT DOMAIN 917 952 TSP TYPE-3 7.
FT DOMAIN 953 1172 C-TERMINAL.
FT SITE 928 930 CELL ATTACHMENT SITE (POTENTIAL).
FT DISULFID 266 266 INTERCHAIN (PROBABLE).
FT DISULFID 270 270 INTERCHAIN (PROBABLE).
FT DISULFID 393 425 BY SIMILARITY.
FT DISULFID 397 430 BY SIMILARITY.
FT DISULFID 408 415 BY SIMILARITY.
FT DISULFID 449 486 BY SIMILARITY.
FT DISULFID 453 491 BY SIMILARITY.
FT DISULFID 464 476 BY SIMILARITY.
FT DISULFID 506 543 BY SIMILARITY.
FT DISULFID 510 548 BY SIMILARITY.
FT DISULFID 521 533 BY SIMILARITY.
FT DISULFID 553 564 BY SIMILARITY.
FT DISULFID 558 574 BY SIMILARITY.
FT DISULFID 577 588 BY SIMILARITY.
FT DISULFID 594 610 BY SIMILARITY.
FT DISULFID 601 619 BY SIMILARITY.
FT DISULFID 622 646 BY SIMILARITY.
FT DISULFID 652 665 BY SIMILARITY.
FT DISULFID 659 678 BY SIMILARITY.
FT DISULFID 680 691 BY SIMILARITY.
FT DISULFID 707 715 BY SIMILARITY.
FT DISULFID 720 740 BY SIMILARITY.
FT DISULFID 756 776 BY SIMILARITY.
FT DISULFID 779 799 BY SIMILARITY.
FT DISULFID 815 835 BY SIMILARITY.
FT DISULFID 838 858 BY SIMILARITY.
FT DISULFID 876 896 BY SIMILARITY.
FT DISULFID 912 932 BY SIMILARITY.
FT DISULFID 948 1169 BY SIMILARITY.
FT CARBOHYD 151 151 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 316 316 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 330 330 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 457 457 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 584 584 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 710 710 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 1069 1069 N-LINKED (GLCNAC...) (POTENTIAL).


SQ SEQUENCE 1172 AA; 129911 MW; 7CE8E4E8599822AB CRC64;



Query Match 6.1%; Score 293; DB 1; Length 1172; 

Best Local Similarity 38.0%; Pred. No. 4.7e-14; 

Matches 60; Conservative 22; Mismatches 66; Indels 10; 



Gaps 



Qy 209 RQARLADTANYTCVAKNIVARRRS-ASAAVIVYVNGGWSTWTEWSVCSASCGRGWQKRSR 267

: : I I : I I : : I I I : I I I I I I : I I I I : I I I II

Db 403 QRGRSCDVTSNTCLGPSIQTRTCSLGKCDTRIRQNGGWSHWSPWSSCSVTCGVGNVTRIR 462

Qy 268 SCTNPAPLNGGAFCEGQNVQKTAC-ATLCPVDGSWSPWSKWSACGLDCT---HWRSRECS 323

I : I I II I : I : I I I : I I I I I I I I I I I : I INI:

Db 463 LCNSPVPQMGGKNCKGSGRETKPCQRDPCPIDGRWSPWSPWSACTVTCAGGIRERSRVCN 522

Qy 324 DPAPRNGGEECQG--TD---LDTRNCTSDLCVHSASGP 356

I I : I I : : I I I : : I : I I I : : I

Db 523 SPEPQYGGKDCVGDVTEHQMCNKRSCPIDGCLSNPCFP 560



RESULT 5 
TSP2_BOVIN 

ID TSP2_BOVIN STANDARD; PRT; 1170 AA. 

AC Q95116; Q28180; 

DT 01-NOV-1997 (Rel. 35, Created) 

DT 15-JUL-1999 (Rel. 38, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Thrombospondin 2 precursor (Corticotropin-induced secreted protein)
DE (CISP).

GN THBS2 OR TSP2 OR TSP-2.
OS Bos taurus (Bovine).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Cetartiodactyla; Ruminantia; Pecora; Bovoidea; 
OC Bovidae; Bovinae; Bos. 
OX NCBI_TaxID=9913; 
RN [1] 

RP SEQUENCE FROM N.A. 

RA Danik M., Chinn A., Lafeuillade M., Keramidas M., Aguesse-Germon S.,

RA Penhoat A., Chen H., Mosher D., Chambaz E.M., Feige J.J.;

RL Submitted (MAR-1998) to the EMBL/GenBank/DDBJ databases.

RN [2]

RP SEQUENCE OF 1-522 FROM N.A.

RX MEDLINE=96331130; PubMed=8698834;

RA Lafeuillade B., Pellerin S., Keramidas M., Danik M., Chambaz E.M.,

RA Feige J.J.;

RT "Opposite regulation of thrombospondin-1 and corticotropin-induced 

RT secreted protein/thrombospondin-2 expression by adrenocorticotropic 

RT hormone in adrenocortical cells."; 

RL J. Cell. Physiol. 167:164-172(1996). 

RN [3] 

RP SEQUENCE OF 318-831 FROM N.A. 

RC TISSUE=Aortic endothelium; 

RA Zafar R.S., Moll Y.D., Womack J.F., Walz D.A.;

RT "Cloning and sequencing of bovine thrombospondin stimulatory effect of

RT TGF-beta.";

RL Submitted (MAY-1995) to the EMBL/GenBank/DDBJ databases.

CC -!- FUNCTION: Adhesive glycoprotein that mediates cell-to-cell and 

CC cell-to-matrix interactions. Can bind to fibrinogen, fibronectin, 

CC laminin and type V collagen. 

CC -!- SUBUNIT: Homotrimer; disulfide-linked.

CC -!- SIMILARITY: Belongs to the thrombospondin family. 

CC -!- SIMILARITY: Contains 1 VWFC domain. 

CC -!- SIMILARITY: Contains 3 EGF-like domains. 

CC -!- SIMILARITY: Contains 3 TSP type-1 domains. 

CC -!- SIMILARITY: Contains 7 TSP type-3 domains. 

CC -!- SIMILARITY: Contains 1 TSP N-terminal (TSPN) domain. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; X96540; CAA65385.1; 

DR EMBL; X87620; CAA60952.1; -. 

DR HSSP; P00740; 1EDM. 

DR InterPro; IPR001881; EGF_Ca. 

DR InterPro; IPR006209; EGF_like. 

DR InterPro; IPR006210; IEGF. 

DR InterPro; IPR000884; TSP1. 

DR InterPro; IPR008085; TSP_1. 

DR InterPro; IPR003367; tsp_3.

DR InterPro; IPR008859; TSPC. 

DR InterPro; IPR003129; TSPN. 

DR InterPro; IPR001007; VWF_C. 

DR Pfam; PF00008; EGF; 1. 

DR Pfam; PF00090; tsp_1; 3.

DR Pfam; PF02412; tsp_3; 13.

DR Pfam; PF05735; TSPC; 1.

DR Pfam; PF02210; TSPN; 1.

DR Pfam; PF00093; vwc; 1.

DR PRINTS; PR01705; TSP1REPEAT.

DR SMART; SM00181; EGF; 3. 



DR SMART; SM00209; TSP1; 3.
DR SMART; SM00210; TSPN; 1.
DR SMART; SM00214; VWC; 1.
DR PROSITE; PS00022; EGF_1; FALSE_NEG.
DR PROSITE; PS01186; EGF_2; 1.
DR PROSITE; PS50026; EGF_3; 2.
DR PROSITE; PS50092; TSP1; 3.
DR PROSITE; PS01208; VWFC_1; 1.
DR PROSITE; PS50184; VWFC_2; 1.
KW Glycoprotein; Cell adhesion; Calcium-binding; Heparin-binding; Repeat;
KW EGF-like domain; Signal.








FT SIGNAL 1 18 POTENTIAL.
FT CHAIN 19 1170 THROMBOSPONDIN 2.
FT DOMAIN 19 215 TSP N-TERMINAL.
FT DOMAIN 19 232 HEPARIN-BINDING (POTENTIAL).
FT DOMAIN 318 375 VWFC.
FT DOMAIN 379 429 TSP TYPE-1 1.
FT DOMAIN 435 490 TSP TYPE-1 2.
FT DOMAIN 492 547 TSP TYPE-1 3.
FT DOMAIN 547 587 EGF-LIKE 1.
FT DOMAIN 588 645 EGF-LIKE 2, CALCIUM-BINDING (POTENTIAL).
FT DOMAIN 646 690 EGF-LIKE 3.
FT DOMAIN 723 758 TSP TYPE-3 1.
FT DOMAIN 759 781 TSP TYPE-3 2.
FT DOMAIN 782 817 TSP TYPE-3 3.
FT DOMAIN 818 840 TSP TYPE-3 4.
FT DOMAIN 841 878 TSP TYPE-3 5.
FT DOMAIN 879 914 TSP TYPE-3 6.
FT DOMAIN 915 950 TSP
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A -> V (IN REF. 3) . 
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S -> T (IN REF. 3) . 




SQ SEQUENCE 1170 AA; 129862 MW; 9CF1FBF55B89A051 CRC64;

Query Match 6.1%; Score 291.5; DB 1; Length 1170;

Best Local Similarity 38.4%; Pred. No. 6.1e-14;

Matches 56; Conservative 21; Mismatches 64; Indels 5; Gaps



Qy 209 RQARLADTANYTCVAKNIVARRRSASAA-VIVYVNGGWSTWTEWSVCSASCGRGWQKRSR 267

: : I I : I I : : I I I : : I I I I I : I I I I : I I I II

Db 401 QRGRSCDVTSNTCLGPSIQTRACSLGRCDHRIRQDGGWSHWSPWSSCSVTCGVGNVTRIR 460

Qy 268 SCTNPAPLNGGAFCEGQNVQKTAC-ATLCPVDGSWSPWSKWSACGLDCT---HWRSRECS 323

I : I I II I : I : II I I I I I I I I I I I I I I : I hi h

Db 461 LCNSPVPQMGGRSCKGSGRETKACQGPPCPVDGRWSPWSPWSACTVTCAGGIRERTRVCN 520

Qy 324 DPAPRNGGEECQGTDLDTRNCTSDLC 349

I I : : I I : : I I : : I I

Db 521 SPEPQHGGKDCVGGAKEQQMCNRKSC 546



RESULT 6 
SM5A_MOUSE 

ID SM5A_MOUSE STANDARD; PRT; 1077 AA. 

AC Q62217; 

DT 30-MAY-2000 (Rel. 39, Created) 

DT 30-MAY-2000 (Rel. 39, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Semaphorin 5A precursor (Semaphorin F) (Sema F).

GN SEMA5A OR SEMAF OR SEMF.

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=NMRI;

RX MEDLINE=96414430; PubMed=8817451;

RA Adams R.H., Betz H., Pueschel A.W.;

RT "A novel class of murine semaphorins with homology to thrombospondin

RT is differentially expressed during early embryogenesis.";

RL Mech. Dev. 57:33-45(1996). 

CC -!- FUNCTION: May act as positive axonal guidance cues. 

CC -!- SUBCELLULAR LOCATION: Type I membrane protein. 

CC -!- TISSUE SPECIFICITY: IN ADULT, DETECTED IN LIVER, BRAIN, KIDNEY,
CC HEART, LUNG AND SPLEEN.
CC -!- DEVELOPMENTAL STAGE: DIFFERENTIALLY EXPRESSED IN EMBRYONIC AND
CC ADULT TISSUES. ITS ABUNDANCE DECREASES FROM E10 TO BIRTH.
CC -!- SIMILARITY: Belongs to the semaphorin family.
CC -!- SIMILARITY: Contains 1 Sema domain.
CC -!- SIMILARITY: Contains 7 TSP type-1 domains.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; X97817; CAA66397.1;
DR MGD; MGI:107556; Sema5a.
DR GO; GO:0016021; C:integral to membrane; IDA.
DR GO; GO:0008046; F:axon guidance receptor activity; IDA.
DR GO; GO:0007411; P:axon guidance; IMP.
DR InterPro; IPR003659; Plexin-like.
DR InterPro; IPR002165; Plexin_repeat.
DR InterPro; IPR001627; Sema.
DR InterPro; IPR000884; TSP1.
DR InterPro; IPR008085; TSP_1.
DR Pfam; PF01437; PSI; 1.
DR Pfam; PF01403; Sema; 1.
DR Pfam; PF00090; tsp_1; 5.
DR PRINTS; PR01705; TSP1REPEAT.
DR SMART; SM00423; PSI; 1.
DR SMART; SM00630; Sema; 1.
DR SMART; SM00209; TSP1; 6.
DR PROSITE; PS50092; TSP1; 6.
KW Signal; Transmembrane; Repeat; Multigene family; Neurogenesis;
KW Developmental protein; Glycoprotein.
FT DOMAIN 897 944 TSP TYPE-1 7.
SQ SEQUENCE 1077 AA; 120826 MW; EDAB0DDDA42789FF CRC64;



Query Match 6.1%; Score 291; DB 1; Length 1077; 

Best Local Similarity 45.8%; Pred. No. 5.9e-14; 

Matches 54; Conservative 10; Mismatches 50; Indels 4; Gaps 2; 

Qy 241 VNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACATL-CPVDG 299 

I I I I I I I I I I I 111:111111 II II::: I I Mill 
Db 783 VNGAWSAWTSWSQCSRDCSRGIRNRKRVCNNPEPKFGGMPCLGPSLEFQECNILPCPVDG 842 

Qy 300 SWSPWSKWSACGLDC---THWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHSAS 354

I I I I I I I I : I : I I I : I I I I I : I I : I : I II 

Db 843 VWSCWSSWSKCSATCGGGHYMRTRSCSNPAPAYGGDICLGLHTEEALCNTQTCPESWS 900 



RESULT 7 
TSP1_XENLA



ID TSP1_XENLA STANDARD; PRT; 1173 AA. 

AC P35448; 

DT 01-JUN-1994 (Rel. 29, Created) 

DT 01-JUN-1994 (Rel. 29, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Thrombospondin 1 precursor. 

GN THBS1 OR TSP1.

OS Xenopus laevis (African clawed frog).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Amphibia; Batrachia; Anura; Mesobatrachia; Pipoidea; Pipidae;

OC Xenopodinae; Xenopus. 

OX NCBI_TaxID=8355; 

RN [1] 

RP SEQUENCE FROM N.A. 

RA Urry L.A., Ramos J., Duquette M., Desimone D.W., Lawler J.;

RL Submitted (XXX-1993) to the EMBL/GenBank/DDBJ databases.

CC -!- FUNCTION: Adhesive glycoprotein that mediates cell-to-cell and 

CC cell-to-matrix interactions. Can bind to fibrinogen, fibronectin, 

CC laminin, type V collagen and integrins alpha-V/beta-1, alpha- 

CC V/beta-3 and alpha-IIb/beta-3 (By similarity).

CC -!- SUBUNIT: Homotrimer; disulfide-linked.

CC -!- SIMILARITY: Belongs to the thrombospondin family. 

CC -!- SIMILARITY: Contains 1 VWFC domain. 

CC -!- SIMILARITY: Contains 3 EGF-like domains. 

CC -!- SIMILARITY: Contains 3 TSP type-1 domains. 

CC -!- SIMILARITY: Contains 7 TSP type-3 domains. 

CC -!- SIMILARITY: Contains 1 TSP N-terminal (TSPN) domain. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; L04278; -; NOT_ANNOTATED_CDS.

DR HSSP; P00740; 1EDM.

DR InterPro; IPR001881; EGF_Ca.

DR InterPro; IPR006209; EGF_like.



DR InterPro; IPR006210; IEGF. 

DR InterPro; IPR000884; TSP1. 

DR InterPro; IPR008085; TSP_1. 

DR InterPro; IPR003367; tsp_3. 

DR InterPro; IPR008859; TSPC. 

DR InterPro; IPR003129; TSPN. 

DR InterPro; IPR001007; VWF_C. 

DR Pfam; PF00008; EGF; 2. 

DR Pfam; PF00090; tsp_1; 3.

DR Pfam; PF02412; tsp_3; 13. 

DR Pfam; PF05735; TSPC; 1. 

DR Pfam; PF02210; TSPN; 1. 

DR Pfam; PF00093; vwc; 1. 

DR PRINTS; PR01705; TSP1REPEAT. 

DR SMART; SM00181; EGF; 2. 

DR SMART; SM00209; TSP1; 3.

DR SMART; SM00210; TSPN; 1. 

DR SMART; SM00214; VWC; 1. 

DR PROSITE; PS00022; EGF_1; FALSE_NEG.

DR PROSITE; PS01186; EGF_2; 1.

DR PROSITE; PS50026; EGF_3; 2.

DR PROSITE; PS50092; TSP1; 3.

DR PROSITE; PS01208; VWFC_1; 1.

DR PROSITE; PS50184; VWFC_2; 1.

KW Glycoprotein; Cell adhesion; Calcium-binding; Heparin-binding; Repeat; 
SQ SEQUENCE 1173 AA; 130019 MW; A9F036D6516C0F24 CRC64;

Query Match 6.1%; Score 290; DB 1; Length 1173;



Best Local Similarity 24.2%; Pred. No. 7.9e-14; 

Matches 92; Conservative 52; Mismatches 144; Indels 92; Gaps 16; 

Qy 11 LLGIVLAAWLRGSG AQQSATVANPVPGANPDLLPHFLVEPEDVYIVKNKPVLLVCK 66 

: I I I I I I | : | | | | : : | : : : : I : | | 

Db 221 VFGTTLEAILRNKGCLSMTNSVITLDNPVNGSSPAI RTNYIGH KTKDLQAVCG 273 

Qy 67 AVPATQIFFKCNGEWVRQVDHVIERSTDGSSGLPTMEVRINVSRQQVEKVFGLEEYWCQC 126

II: : : I I I : : : I I I II 
Db 274 FSCDD LSKLFAEMKGLRTL VTTLKDQVTKETEKNELIAQI 313 

Qy 127 VAWSSSGTTKSQKAYIRIARLRKNFEQ EPLAKEVSLEQGIVLPCRP 172 

I I :: : III:: : :: I I :: I I 

Db 314 V TRTPGVCLHNGVLHKNRDEWTVDSCTECTCQNSATICRKVSCP LMPCTN 363 

Qy 173 PEG 1 P P AE VE W L RN E D LVD P S L D P N VY I T RE H S L WRQ ARLAD T AN YT C 221 

I : I I : : I I I : I I : : : I I : I I 

Db 364 ATIPDGECCPRCWPSDSADDDWSPWSDWTPCS VTCGHG-IQQRGRSCDSLNNPC 416 

Qy 222 VAKNIVAR RRSASAAVIVYVNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPA 273 

I : I : I I I I I : I I I I : I I I I I I : I 

Db 417 EGSSVQTRSCQIQDCDKRFKQ DGGWSHWSPWSSCSVTCGSGQITRIRLCNSPV 469

Qy 274 PLNGGAFCEGQNVQKTAC-ATLCPVDGSWSPWSKWSACGLDC THWRSRECSDPAPRN 329 

I I Ml: : I I I :: I I I I I I I : I I I I :: I I : 

Db 470 PQLNGKQCEGEGRENKPCQKDPCPINGQWGPWSLWDTCTVTCGGGMQKRERLCNNPKPQY 529

Qy 330 GGEECQGTDLDTRNCTSDLC 349 

|::| I I:: I I 
Db 530 EGKDCIGEPTDSQICNKQDC 549 



RESULT 8 
SM5B_HUMAN 

ID SM5B_HUMAN STANDARD; PRT; 1093 AA. 

AC Q9P283; 

DT 10-OCT-2003 (Rel. 42, Created) 

DT 10-OCT-2003 (Rel. 42, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Semaphorin 5B precursor. 

GN SEMA5B OR KIAA1445. 

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A.

RC TISSUE=Brain;

RX MEDLINE=20277482; PubMed=10819331;

RA Nagase T., Kikuno R., Ishikawa K.-I., Hirosawa M., Ohara O.;

RT "Prediction of the coding sequences of unidentified human genes. XVII. 

RT The complete sequences of 100 new cDNA clones from brain which code 

RT for large proteins in vitro."; 

RL DNA Res. 7:143-150(2000). 

CC -!- FUNCTION: May act as positive axonal guidance cues (By 
CC similarity).

CC -!- SUBCELLULAR LOCATION: Type I membrane protein. 

CC -!- SIMILARITY: Belongs to the semaphorin family. 

CC -!- SIMILARITY: Contains 1 Sema domain. 

CC -!- SIMILARITY: Contains 7 TSP type-1 domains. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way

CC modified and this statement is not removed. Usage by and for commercial

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; AB040878; BAA95969.1; ALT_INIT. 

DR Genew; HGNC: 10737; SEMA5B. 

DR InterPro; IPR003659; Plexin-like. 

DR InterPro; IPR002165; Plexin_repeat.

DR InterPro; IPR001627; Sema.

DR InterPro; IPR000884; TSP1.

DR InterPro; IPR008085; TSP_1.

DR Pfam; PF01437; PSI; 1.

DR Pfam; PF01403; Sema; 1.

DR Pfam; PF00090; tsp_1; 5.

DR PRINTS; PR01705; TSP1REPEAT.

DR SMART; SM00423; PSI; 1.

DR SMART; SM00630; Sema; 1.

DR SMART; SM00209; TSP1; 4.

DR PROSITE; PS50092; TSP1; 5.

KW Signal; Transmembrane; Repeat; Multigene family; Neurogenesis; 

KW Developmental protein; Glycoprotein. 

FT SIGNAL 1 26 POTENTIAL.

FT CHAIN 27 1093 SEMAPHORIN 5B. 
FT DOMAIN 236 518 SEMA.
FT DOMAIN 551 605 TSP TYPE-1 1.
FT DOMAIN 606 662 TSP TYPE-1 2.
FT DOMAIN 664 713 TSP TYPE-1 3.
FT DOMAIN 721 776 TSP TYPE-1 4.
FT DOMAIN 795 850 TSP TYPE-1 5.
FT DOMAIN 852 907 TSP TYPE-1 6.
FT DOMAIN 908 952 TSP TYPE-1 7.
FT CARBOHYD 59 59 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 95 95 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 157 157 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 178 178 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 287 287 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 333 333 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 378 378 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 532 532 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 539 539 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 547 547 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 602 602 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 728 728 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 944 944 N-LINKED (GLCNAC...) (POTENTIAL).


SQ SEQUENCE 1093 AA; 119866 MW; F1FDEFB87CEAF0EF CRC64;

Query Match 5.8%; Score 276; DB 1; Length 1093;

Best Local Similarity 31.0%; Pred. No. 8.4e-13;

Matches 72; Conservative 35; Mismatches 79; Indels 46; Gaps 11;



Qy 241 VNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTACAT-LCPVDG 299

I I I I I I I I I I I I I I I : I : I I I I : I I I I I I : : : III II

Db 851 VRGAWSCWTSWSPCSASCGGGHYQRTRSCTSPAPSPGEDICLGLHTEEALCATQACP--E 908



Qy 300 SWSPWSKWSACGLDCTHWRSRECSDPAPRNGGEECQGTDLDTRNCT-SDL-CVHSASGPE 357 

I I I I I : I I I I INI: I I II : I I I : : : I I I 

Db 909 GWSPWSEWSKCTDDGAQSRSRHCEELLP--GSSACAGNSSQSRPCPYSEIPVILPASSME 966

Qy 358 DVALYVG LIAVAVCLVL LLLVLILVYCR— KKEGLDSDVADSSILTSGFQPV 407 

: I I : I : I I I : : : I : : : : : I : 
Db 967 EATGCAGFNLIHLVATGISCFLGSGLLTLAVYLSCQHCQRQSQESTL 1013 

Qy 408 SIKPSKADNPHLLTIQPDLSTTTTTYQGSLCPRQDGPSP-KFQLTNGHLLSP 458 

: I : : : I I : I I : : : I : I : I : I I 

Db 1014 - VHPAT PNHLH YKGGGT P KN E K YT PME FKT LNKNN LIP 1050 



RESULT 9 
SM5B_MOUSE 

ID SM5B_MOUSE STANDARD; PRT; 1093 AA. 

AC Q60519; 

DT 30-MAY-2000 (Rel. 39, Created) 

DT 30-MAY-2000 (Rel. 39, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Semaphorin 5B precursor (Semaphorin G) (Sema G).

GN SEMA5B OR SEMAG OR SEMG.

OS Mus musculus (Mouse).



OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.
OX NCBI_TaxID=10090;
RN [1]
RP SEQUENCE FROM N.A.
RC STRAIN=NMRI;
RX MEDLINE=96414430; PubMed=8817451;
RA Adams R.H., Betz H., Pueschel A.W.;
RT "A novel class of murine semaphorins with homology to thrombospondin
RT is differentially expressed during early embryogenesis.";
RL Mech. Dev. 57:33-45(1996).
CC -!- FUNCTION: May act as positive axonal guidance cues.
CC -!- SUBCELLULAR LOCATION: Type I membrane protein.
CC -!- TISSUE SPECIFICITY: In adult, only detected in brain.
CC -!- DEVELOPMENTAL STAGE: Differentially expressed in embryonic and
CC adult tissues. Its abundance decreases from E10 to birth.
CC -!- SIMILARITY: Belongs to the semaphorin family.
CC -!- SIMILARITY: Contains 1 Sema domain.
CC -!- SIMILARITY: Contains 7 TSP type-1 domains.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; X97818; CAA66398.1; -.
DR MGD; MGI:107555; Sema5b.
DR InterPro; IPR003659; Plexin-like.
DR InterPro; IPR002165; Plexin_repeat.
DR InterPro; IPR001627; Sema.
DR InterPro; IPR000884; TSP1.
DR InterPro; IPR008085; TSP_1.
DR Pfam; PF01437; PSI; 1.
DR Pfam; PF01403; Sema; 1.
DR Pfam; PF00090; tsp_1; 5.
DR PRINTS; PR01705; TSP1REPEAT.
DR SMART; SM00423; PSI; 1.
DR SMART; SM00630; Sema; 1.
DR SMART; SM00209; TSP1; 4.
DR PROSITE; PS50092; TSP1; 5.
KW Signal; Transmembrane; Repeat; Multigene family; Neurogenesis;
KW Developmental protein; Glycoprotein.



FT SIGNAL 1 19 POTENTIAL.
FT CHAIN 20 1093 SEMAPHORIN 5B.
FT DOMAIN 20 978 EXTRACELLULAR (POTENTIAL).
FT TRANSMEM 979 999 POTENTIAL.
FT DOMAIN 1000 1093 CYTOPLASMIC (POTENTIAL).
FT DOMAIN 236 518 SEMA.
FT DOMAIN 551 605 TSP TYPE-1 1.
FT DOMAIN 606 662 TSP TYPE-1 2.
FT DOMAIN 664 713 TSP TYPE-1 3.
FT DOMAIN 721 776 TSP TYPE-1 4.
FT DOMAIN 795 850 TSP TYPE-1 5.
FT DOMAIN 852 907 TSP TYPE-1 6.
FT DOMAIN 908 952 TSP TYPE-1 7.
FT CARBOHYD 59 59 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 95 95 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 157 157 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 178 178 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 287 287 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 333 333 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 378 378 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 532 532 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 539 539 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 547 547 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 602 602 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 728 728 N-LINKED (GLCNAC...) (POTENTIAL).
FT CARBOHYD 944 944 N-LINKED (GLCNAC...) (POTENTIAL).


SQ SEQUENCE 1093 AA; 120326 MW; 29E5C9B1E8108717 CRC64;

Query Match 5.8%; Score 275.5; DB 1; Length 1093;

Best Local Similarity 32.1%; Pred. No. 9.2e-13;

Matches 69; Conservative 18; Mismatches 75; Indels 53;



Qy 163 EQGIVLPCRPPEGIPPAEVEWLRNEDLVDPSLDPNVYITREHSLWRQARLADTANYTCV 222 

II III III I I : I I I I 
Db 737 EQRFRFTCRAP LPDP HGLQFGKRR TETRTCP 767 

Qy 223 AKNIVA RRRSASAAVIVYVNGGWSTWTEWSVCSASCGRGWQKRSRSCTN 271

I I I II : I I I I : I I I I I I I I : : I I : I I I
Db 768 ADGTGACDTDALVEDLLRSGSTSPHTL NGGWATWGPWSSCSRDCELGFRVRKRTCTN 824

Qy 272 PAPLNGGAFCEGQNVQKTAC-ATLCPVDGSWSPWSKWSACGLDC---THWRSRECSDPAP 327

I I I I I II : I I I I I : I I I : I I I I : I : I I : I I I

Db 825 PEPRNGGLPCVGDAAEYQDCNPQACPVRGAWSCWTAWSQCSASCGGGHYQRTRSCTSPAP 884

Qy 328 RNGGEECQGTDLDTRNCTSDLCVHSASGPEDVALY 362

I : I I : I :: I I I : I : 

Db 885 SPGEDICLGLHTEEALCSTQAC PEGWSLW 913 



RESULT 10 
BAI3_HUMAN

ID BAI3_HUMAN STANDARD; PRT; 1522 AA.

AC O60242; O60297;

DT 16-OCT-2001 (Rel. 40, Created) 

DT 16-OCT-2001 (Rel. 40, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Brain-specific angiogenesis inhibitor 3 precursor. 

GN BAI3 OR KIAA0550. 

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Fetal brain; 

RX MEDLINE=98194217; PubMed=9533023;

RA Shiratsuchi T., Nishimori H., Ichise H., Nakamura Y., Tokino T.;

RT "Cloning and characterization of BAI2 and BAI3, novel genes homologous

RT to brain-specific angiogenesis inhibitor 1 (BAI1)."; 



RL 
RN 
RP 
RC 
RX 
RA 
RA 
RT 
RT 
RT 
RL 
RN 
RP 
RX 
RA 
RT 
RT 
RL 
CC 
CC 
CC 
CC 
CC 
CC 
CC 
CC 
CC 
CC 
CC 
CC 
CC 
CC 
CC 
CC 
CC 
CC 
CC 
DR 
DR 
DR 
DR 
DR 
DR 
DR 
DR 
DR 
DR 
DR 
DR 
DR 
DR 
DR 
DR 
DR 
DR 
DR 
DR 



Tanaka A. , Kotani H. 



_ i 

- i 



Cytogenet. Cell Genet. 79:103-108(1997) 
[2] 

SEQUENCE FROM N.A. 
TISSUE=Brain; 

MEDLINE-982 90545; PubMed=9628581; 
Nagase T., Ishikawa K.-I., Miyajima N., 
Nomura N., Ohara O. ; 

"Prediction of the coding sequences of unidentified human genes. IX. 
The complete sequences of 100 new cDNA clones from brain which can 
code for large proteins in vitro."; 
DNA Res. 5:31-39(1998) . 
[3] 

SEQUENCE FROM N.A., AND REVISIONS TO 643-665 AND C-TERMINUS. 
MEDLINE=22158633; PubMed=12168 954 ; 

Nakajima D., Okazaki N., Yamakawa H., Kikuno R. , Ohara O., Nagase T. ; 
"Construction of expression-ready cDNA clones for KIM genes: manual 
curation of 330 KIAA cDNA clones."; 
DNA Res. 9:99-106(2002). 

-!- FUNCTION: MIGHT BE INVOLVED IN ANGIOGENESIS INHIBITION AND 
SUPPRESSION. OF GLIOBLASTOMA. 

SUBCELLULAR LOCATION: Integral membrane protein. 

TISSUE SPECIFICITY: STRONGLY EXPRESSED IN BRAIN. ALSO DETECTED IN 
HEART. REDUCED EXPRESSION IS OBSERVED IN SOME GLIOBLASTOMA CELL 
LINES. 

SIMILARITY: Belongs to family 2 of G-protein coupled receptors. 
SIMILARITY: Contains 1 CUB domain. 
SIMILARITY: Contains 4 TSP type-1 domains. 
SIMILARITY: Contains 1 GPS domain. 

This SWISS-PROT entry is copyright. It is produced through a collaboration 
between the Swiss Institute of Bioinf ormatics and the EMBL outstation - 
the European Bioinf ormatics Institute. There are no restrictions on its 
use by non-profit institutions as long as its content is in no way 
modified and this statement is not removed. Usage by and for commercial 
entities requires a license agreement (See http://www.isb-sib.ch/announce/ 
or send an email to license@isb-sib.ch). 

EMBL; AB005299; BAA25363.1; -. 

EMBL; AB011122; BAA25476.2; ALT_INIT. 

PIR; T00028; T00028. 

Genew; HGNC:945; BAI3. 

MIM; 602684; -. 

InterPro; IPR000859; CUB. 

InterPro; IPR000832; GPCR_secretin.

InterPro; IPR001879; hormn_receptor.

InterPro; IPR000203; PKD_cys_rich.

InterPro; IPR000884; TSP1.

Pfam; PF00002; 7tm_2; 1.

Pfam; PF01825; GPS; 1.

Pfam; PF02793; HRM; 1.

Pfam; PF00090; tsp_1; 4.

SMART; SM00303; GPS; 1.

SMART; SM00008; HormR; 1.

SMART; SM00209; TSP1; 4.

PROSITE; PS01180; CUB; 1. 

PROSITE; PS50221; GPS; 1. 

PROSITE; PS00649; G_PROTEIN_RECEP_F2_1; FALSE_NEG.

DR PROSITE; PS00650; G_PROTEIN_RECEP_F2_2; FALSE_NEG.

DR PROSITE; PS50227; G_PROTEIN_RECEP_F2_3; 1.

DR PROSITE; PS50261; G_PROTEIN_RECEP_F2_4; 1.

DR PROSITE; PS50092; TSP1; 4.

KW G-protein coupled receptor; Transmembrane; Glycoprotein; Signal;

KW Repeat.










FT   SIGNAL       1     24       POTENTIAL.
FT   CHAIN       25   1522       BRAIN-SPECIFIC ANGIOGENESIS INHIBITOR 3.
FT   DOMAIN      25              EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM   881    901       1 (POTENTIAL).
FT   DOMAIN     902    910       CYTOPLASMIC (POTENTIAL).
FT   TRANSMEM   911    931       2 (POTENTIAL).
FT   DOMAIN     932    939       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM   940    960       3 (POTENTIAL).
FT   DOMAIN     961    981       CYTOPLASMIC (POTENTIAL).
FT   TRANSMEM   982   1002       4 (POTENTIAL).
FT   DOMAIN    1003   1023       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM  1024   1044       5 (POTENTIAL).
FT   DOMAIN    1045   1098       CYTOPLASMIC (POTENTIAL).
FT   TRANSMEM  1099   1119       6 (POTENTIAL).
FT   DOMAIN    1120   1125       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM  1126   1146       7 (POTENTIAL).
FT   DOMAIN    1147   1522       CYTOPLASMIC (POTENTIAL).
FT   DOMAIN      30    159       CUB.
FT   DOMAIN     291    343       TSP TYPE-1 1.
FT   DOMAIN     345    398       TSP TYPE-1 2.
FT   DOMAIN     400    453       TSP TYPE-1 3.
FT   DOMAIN     455    508       TSP TYPE-1 4.
FT   DOMAIN     816    868       GPS.
FT   DOMAIN     942    945       POLY-THR.
FT   DOMAIN    1173   1176       POLY-SER.
FT   CARBOHYD    51     51       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    54     54       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD    82     82       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   105    105       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   241    241       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   337    337       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   418    418       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   540    540       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   625    625       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   779    779       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   812    812       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   828    828       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   937    937       N-LINKED (GLCNAC...) (POTENTIAL).
SQ   SEQUENCE   1522 AA;  171490 MW;  D22D0A5D4BB62502 CRC64;


Query Match 5.7%; Score 275; DB 1; Length 1522;

Best Local Similarity 39.0%; Pred. No. 1.6e-12;

Matches 57; Conservative 20; Mismatches 53; Indels 16; Gaps 6;

Qy 220 TCVA------KNIVARRRSASAAVIVYVNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPA 273

III: : I : : I : I I I : I I : I I : I I I I : I : I I I I 

Db 317 TCVSPYGTHCSGPLRESRVCNNTALCPVHGVWEEWSPWSLCSFTCGRGQRTRTRSCT--P 374

Qy 274 PLNGGAFCEGQNVQKTAC-ATLCPVDGSWSPWSKWSACGLDC---THWRSRECSDPAPRN 329

I II I I I I I I I I I I I I I I I I : I I I I I : I : I : 

Db 375 PQYGGRPCEGPETHHKPCNIALCPVDGQWQEWSSWSQCSVTCSNGTQQRSRQCT--AAAH 432



Qy 330 GGEECQGTDLDTRNCTSDLCVHSASG 355 

I I I I : I :: I I : I : I : I 
Db 433 GGSECRGPWAESRECYNPEC--TANG 456 

RESULT 11 
BAI2_HUMAN 

ID BAI2_HUMAN STANDARD; PRT; 1572 AA.

AC O60241;

DT 16-OCT-2001 (Rel. 40, Created) 

DT 16-OCT-2001 (Rel. 40, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Brain-specific angiogenesis inhibitor 2 precursor. 

GN BAI2.

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Fetal brain; 

RX MEDLINE=98194217; PubMed=9533023;

RA Shiratsuchi T., Nishimori H., Ichise H., Nakamura Y., Tokino T.;

RT "Cloning and characterization of BAI2 and BAI3, novel genes homologous

RT to brain-specific angiogenesis inhibitor 1 (BAI1)."; 

RL Cytogenet. Cell Genet. 79:103-108(1997). 

CC -!- FUNCTION: MIGHT BE INVOLVED IN ANGIOGENESIS INHIBITION. 

CC -!- SUBCELLULAR LOCATION: Integral membrane protein. 

CC -!- TISSUE SPECIFICITY: STRONGLY EXPRESSED IN BRAIN. ALSO DETECTED IN 

CC HEART, THYMUS, SKELETAL MUSCLE, AND DIFFERENT CELL LINES. 

CC -!- SIMILARITY: Belongs to family 2 of G-protein coupled receptors. 

CC -!- SIMILARITY: Contains 4 TSP type-1 domains. 

CC -!- SIMILARITY: Contains 1 GPS domain. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 

DR EMBL; AB005298; BAA25362.1; -. 

DR PIR; T00027; T00027. 

DR Genew; HGNC:944; BAI2.

DR MIM; 602683; -.

DR InterPro; IPR000832; GPCR_secretin.

DR InterPro; IPR001879; hormn_receptor.

DR InterPro; IPR000203; PKD_cys_rich.

DR InterPro; IPR000884; TSP1.

DR Pfam; PF00002; 7tm_2; 1.

DR Pfam; PF01825; GPS; 1.

DR Pfam; PF02793; HRM; 1.

DR Pfam; PF00090; tsp_1; 4.

DR SMART; SM00303; GPS; 1. 

DR SMART; SM00008; HormR; 1. 



DR SMART; SM00209; TSP1; 4.

DR PROSITE; PS50221; GPS; 1.

DR PROSITE; PS00649; G_PROTEIN_RECEP_F2_1; FALSE_NEG.

DR PROSITE; PS00650; G_PROTEIN_RECEP_F2_2; FALSE_NEG.

DR PROSITE; PS50227; G_PROTEIN_RECEP_F2_3; 1.

DR PROSITE; PS50261; G_PROTEIN_RECEP_F2_4; 1.

DR PROSITE; PS50092; TSP1; 4.

KW G-protein coupled receptor; Transmembrane; Glycoprotein; Signal;

KW Repeat.








FT   SIGNAL       1     20       POTENTIAL.
FT   CHAIN       21   1572       BRAIN-SPECIFIC ANGIOGENESIS INHIBITOR 2.
FT   DOMAIN      21    924       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM   925    945       1 (POTENTIAL).
FT   DOMAIN     946    953       CYTOPLASMIC (POTENTIAL).
FT   TRANSMEM   954    974       2 (POTENTIAL).
FT   DOMAIN     975    982       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM   983   1003       3 (POTENTIAL).
FT   DOMAIN    1004   1024       CYTOPLASMIC (POTENTIAL).
FT   TRANSMEM  1025   1045       4 (POTENTIAL).
FT   DOMAIN    1046   1066       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM  1067   1087       5 (POTENTIAL).
FT   DOMAIN    1088   1141       CYTOPLASMIC (POTENTIAL).
FT   TRANSMEM  1142   1162       6 (POTENTIAL).
FT   DOMAIN    1163   1168       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM  1169   1189       7 (POTENTIAL).
FT   DOMAIN    1190   1572       CYTOPLASMIC (POTENTIAL).
FT   DOMAIN     297    350       TSP TYPE-1 1.
FT   DOMAIN     352    405       TSP TYPE-1 2.
FT   DOMAIN     407    460       TSP TYPE-1 3.
FT   DOMAIN     463    516       TSP TYPE-1 4.
FT   DOMAIN     859    911       GPS.
FT   DOMAIN     117    122       POLY-GLU.
FT   DOMAIN     177    180       POLY-ASN.
FT   DOMAIN     222    225       POLY-THR.
FT   DOMAIN    1303   1306       POLY-PRO.
FT   DOMAIN    1352   1358       POLY-GLY.
FT   DOMAIN    1413   1418       POLY-PRO.
FT   CARBOHYD    94     94       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   179    179       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   180    180       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   344    344       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   425    425       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   548    548       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   633    633       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   855    855       N-LINKED (GLCNAC...) (POTENTIAL).
SQ   SEQUENCE   1572 AA;  171140 MW;  A9775645B77BC285 CRC64;
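
The FT (feature table) rows for BAI2_HUMAN above describe a seven-pass membrane topology: alternating EXTRACELLULAR / CYTOPLASMIC DOMAIN rows and numbered TRANSMEM rows. A minimal sketch of how such rows could be sanity-checked is given below. It assumes each row has the shape `FT KEY START END DESCRIPTION`; the helper names (`parse_ft`, `topology`, `is_contiguous`) are illustrative and are not part of any Swiss-Prot tool. The coordinates are copied from the entry.

```python
import re

# Topology rows copied from the BAI2_HUMAN feature table, plus one
# non-topology row (a TSP type-1 domain) to show that it is filtered out.
FT_ROWS = """\
FT   SIGNAL       1     20       POTENTIAL.
FT   DOMAIN      21    924       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM   925    945       1 (POTENTIAL).
FT   DOMAIN     946    953       CYTOPLASMIC (POTENTIAL).
FT   TRANSMEM   954    974       2 (POTENTIAL).
FT   DOMAIN     975    982       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM   983   1003       3 (POTENTIAL).
FT   DOMAIN    1004   1024       CYTOPLASMIC (POTENTIAL).
FT   TRANSMEM  1025   1045       4 (POTENTIAL).
FT   DOMAIN    1046   1066       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM  1067   1087       5 (POTENTIAL).
FT   DOMAIN    1088   1141       CYTOPLASMIC (POTENTIAL).
FT   TRANSMEM  1142   1162       6 (POTENTIAL).
FT   DOMAIN    1163   1168       EXTRACELLULAR (POTENTIAL).
FT   TRANSMEM  1169   1189       7 (POTENTIAL).
FT   DOMAIN    1190   1572       CYTOPLASMIC (POTENTIAL).
FT   DOMAIN     297    350       TSP TYPE-1 1.
"""

# Assumed row shape: line code, feature key, start, end, free-text description.
ROW = re.compile(r"^FT\s+(\S+)\s+(\d+)\s+(\d+)\s+(.*)$")


def parse_ft(text):
    """Return (key, start, end, description) tuples for every FT row."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line)
        if m:
            key, start, end, desc = m.groups()
            rows.append((key, int(start), int(end), desc.strip()))
    return rows


def topology(rows):
    """Keep only membrane-topology segments, sorted by start position."""
    sides = ("EXTRACELLULAR", "CYTOPLASMIC")
    segs = [r for r in rows
            if r[0] == "TRANSMEM"
            or (r[0] == "DOMAIN" and r[3].startswith(sides))]
    return sorted(segs, key=lambda r: r[1])


def is_contiguous(segs):
    """True if each segment starts right after the previous one ends."""
    return all(a[2] + 1 == b[1] for a, b in zip(segs, segs[1:]))


segments = topology(parse_ft(FT_ROWS))
print(len(segments), is_contiguous(segments))
```

The same check can be pointed at the other entries in this report by swapping in their FT rows; rows with a missing coordinate would simply not match the assumed pattern and be skipped.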



Query Match 5.7%; Score 274.5; DB 1; Length 1572; 

Best Local Similarity 19.2%; Pred. No. 1.8e-12; 

Matches 176; Conservative 108; Mismatches 307; Indels 327; Gaps 38; 
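
The summary figures printed for each hit appear to be related by simple arithmetic. This relationship is inferred from the numbers in this report, not taken from documentation of the search program: "Best Local Similarity" matches Matches divided by the number of alignment columns (Matches + Conservative + Mismatches + Indels), and "Query Match" matches Score divided by the perfect score of 4791 given in the report header. The sketch below recomputes both for RESULT 11; `parse_summary` and `recompute` are hypothetical helper names.

```python
PERFECT_SCORE = 4791.0  # "Perfect score" as printed in the report header

# Summary block for RESULT 11 (BAI2_HUMAN), copied from the report.
SUMMARY = """\
Query Match 5.7%; Score 274.5; DB 1; Length 1572;
Best Local Similarity 19.2%; Pred. No. 1.8e-12;
Matches 176; Conservative 108; Mismatches 307; Indels 327; Gaps 38;
"""


def parse_summary(text):
    """Split 'Name value;' pairs into a dict of floats."""
    fields = {}
    for piece in text.split(";"):
        piece = piece.strip()
        if not piece:
            continue
        # The value is the last whitespace-separated token of each piece.
        name, value = piece.rsplit(None, 1)
        fields[name] = float(value.rstrip("%"))
    return fields


def recompute(fields, perfect_score=PERFECT_SCORE):
    """Recompute (Best Local Similarity %, Query Match %) from the counts."""
    columns = (fields["Matches"] + fields["Conservative"]
               + fields["Mismatches"] + fields["Indels"])
    similarity = 100.0 * fields["Matches"] / columns
    query_match = 100.0 * fields["Score"] / perfect_score
    return round(similarity, 1), round(query_match, 1)


fields = parse_summary(SUMMARY)
print(recompute(fields), (fields["Best Local Similarity"], fields["Query Match"]))
```

If the inferred relationship holds, the recomputed pair agrees with the printed pair to one decimal place; a disagreement would point to a digit damaged in extraction.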

Qy 173 PEGIPPAEVEWLRNEDLVDPSLDPNVY 1 T REH S L WRQARL 213 

I I I : : I I : I : I : I : I I I I I 

Db 271 PEEEPKVKTQWPRSAD EPGLYMAQTGDPAAEEWSPWSVCSLTCGQGLQVR-TRS 323 

Qy 214 ADTANYTCVAKNIVARRRSASAAVIWVNGGWSTWTEWSVCSASCGRGWQKRSRSCTNPA 273 

: : I : : I : : I : I I I I I : I I I I I I I : I I : I 



Db 324 CVSSPYGTLCSGPLRETRPCNNSATCPVHGVWEEWGSWSLCSRSCGRGSRSRMRTCV--P 381



Qy 274 PLNGGAFCEGQNVQKTACA-TLCPVDGSWSPWSKWSACGLDC---THWRSRECSDPAPR- 328

1:11 Ml : I I : I I I : I I I I I I I I I I : I I I 
Db 382 PQHGGKACEGPELQTKLCSMAACPVEGQWLEWGPWGPCSTSCANGTQQRSRKCSVAGPAW 441 

Qy 329 NGGEECQ 335

I I : 

Db 442 ATCTGALTDTRECSNLECPATDSKWGPWNAWSLCSKTCDTGWQRRFRMCQATGTQGYPCE 501 

Qy 336 GTDLDTRNCTSDLC — VHSASGPEDVAL 361 

I I : : I : I I III 
Db 502 GTGEEVKPCSEKRCPAFHEMCRDEYVMLMTWKKAAAGEIIYNKCPPNASGSASRRCLLSA 561

Qy 362 YVGL I AVAVC L VLLLLVLILVYCRKKEGLDSDVADSSILTSGFQPVSIKPSKA 414 

I I I : I I : I : : : I : : : I : : I I : : : 

Db 562 QGVAYWGLPSFARCISHEYRYLYLSLREHLAKGQRMLAGEGMSQWRS-LQELLARRTYY 620 

Qy 415 DNPHLLTIQPDLSTTTTTYQGSLCPRQDGPSPKFQLT NGHLLSPLGG 461

I :: : | | : : | | I I : : : | | | 

Db 621 SGDLLFSWILRNVTDTFKRATYVPS7VDDVQRFFQWSFMVDAENKEKWDDAQQVSP — G 678 

Qy 462 GRHTLHHSSPTSEAEEFV SRLSTQNYFRSLPRG TSNMTYGTFN 504 

II I : I : I : I I I : I : I : : I : 

Db 679 SVHLLR WEDFIHLVGDALKAFQSSLIVTDNLVISIQREPVSAVSSDITFPMRG 732 

Qy 505 FLG GRLMI PNTGI SLLI P PDAIPRGK 530 

I | | : | : | | | I : I I 

Db 733 RRGMKDWVRHSEDRLFLPKEVLSLSSPGKPATSGAAGSPGRGRGPGTVPPGPGHSHQRLL 792 

Qy 531 IYE-IYLTLHKPEDVRLPLAGCQTLLSPIVSCGPPGVLLTRPVIL 574 

:| : I I I I Ml ::: |: || |:| 

Db 793 PADPDESSYFVTGAVLYRTLGLILPPP RPPLAVTSRVMT — VTVRPPTQPPAEPLIT 847 

Qy 575 A MDHCGEPSPDSWSLRLKKQSCEGSWEDVLHLGEEAPSHLYYCQ-LEASACYV — 626 

: I II : : I | : Mill: 
Db 848 VELSYIINGTTDPHCASWDYS-RADASSGDWD TENCQTLETQAAHTRC 894

Qy 627 FTEQLGRFALVGE ALSVAAAKRLKLLLFAPVACTSLEYNIRVYCLHDTHDALKEV 681 

: I II::: I : I : : I : : I : I : I : : I I 
Db 895 QCQHLSTFAVLAQPPKDLTLELAGSPSVPLVIGCAVSCMALLTLLAIYA AFWRF 948

Qy 682 VQLEKQLGGQLIQEPRVLHFKDSYHNLRLSIHDVPSSLWKSKLLVSYQEIPFYHIWNGTQ 741 

: : I : : I I I I : I : : : I I : : I 

Db 949 IKSERSI ILLNFCLSI — LASNI L I LVGQ S RVL S KGVCTMT A 988 

Qy 742 RYLHCTFTLERVSPSTSDLACKLWV WQVEGDG 773 

: I I I I : I I I : 

Db 989 AFLHFFF LS S FCWVLTEAWQS YLAVI GRMRTRLVRKRFLCLGWGLPALV 1037 

Qy 774 QSFSINFNITKDTRFAELLALESEAG-VPALVGPSA FKIPFLIRQKI IS 821 

: I : I I I : I I I : I I I I : I I : : I : II 

Db 1038 VAVSVGFTRTKGYGTSSYCWLSLEGGLLYAFVGPAAVIVLVNMLIGIIVFNKLMARDGIS 1097 

Qy 822 SLDPPCRRGAD WRTL 836 

II:: 1:1 
Db 1098 DKSKKQRAGSERCPWASL 1115 



RESULT 12 
TSP1_MOUSE

ID TSP1_MOUSE STANDARD; PRT; 1170 AA.

AC P35441; 

DT 01-JUN-1994 (Rel. 29, Created) 

DT 01-JUN-1994 (Rel. 29, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 

DE Thrombospondin 1 precursor. 

GN THBS1 OR TSP1.

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Mus.

OX NCBI_TaxID=10090; 

RN [1] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=92128941; PubMed=1774063;

RA Lawler J., Duquette M., Ferro P., Copeland N.G., Gilbert D.J.,

RA Jenkins N.A.;

RT "Characterization of the murine thrombospondin gene."; 

RL Genomics 11:587-600(1991). 

RN [2] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=92147683; PubMed=1371115;

RA Laherty C.D., O'Rourke K., Wolf F.W., Katz R., Seldin M.F.,

RA Dixit V.M.;

RT "Characterization of mouse thrombospondin 2 sequence and expression 

RT during cell growth and development."; 

RL J. Biol. Chem. 267:3274-3281(1992). 

RN [3] 

RP SEQUENCE OF 1-490 FROM N.A.

RX MEDLINE=90375546; PubMed=2398070;

RA Bornstein P., Alfi D., Devarayalu S., Framson P., Li P.;

RT "Characterization of the mouse thrombospondin gene and evaluation of 

RT the role of the first intron in human gene expression."; 

RL J. Biol. Chem. 265:16691-16698(1990). 

CC -!- FUNCTION: Adhesive glycoprotein that mediates cell-to-cell and 

CC cell-to-matrix interactions. Can bind to fibrinogen, fibronectin, 

CC laminin, type V collagen and integrins alpha-V/beta-1, alpha- 

CC V/beta-3 and alpha-IIb/beta-3.

CC -!- SUBUNIT: Homotrimer; disulfide-linked.

CC -!- SIMILARITY: Belongs to the thrombospondin family. 

CC -!- SIMILARITY: Contains 1 VWFC domain. 

CC -!- SIMILARITY: Contains 3 EGF-like domains. 

CC -!- SIMILARITY: Contains 3 TSP type-1 domains. 

CC -!- SIMILARITY: Contains 7 TSP type-3 domains. 

CC -!- SIMILARITY: Contains 1 TSP N-terminal (TSPN) domain. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch). 

CC 



DR EMBL; M62470; AAA50611.1; -. 

DR EMBL; M62450; AAA50611.1; JOINED.

DR EMBL; M62451; AAA50611.1; JOINED. 

DR EMBL; M62452; AAA50611.1; JOINED. 

DR EMBL; M62453; AAA50611.1; JOINED. 

DR EMBL; M62454; AAA50611.1; JOINED. 

DR EMBL; M62455; AAA50611.1; JOINED.

DR EMBL; M62456; AAA50611.1; JOINED. 

DR EMBL; M62457; AAA50611.1; JOINED. 

DR EMBL; M62458; AAA50611.1; JOINED. 

DR EMBL; M62459; AAA50611.1; JOINED. 

DR EMBL; M62460; AAA50611.1; JOINED. 

DR EMBL; M62461; AAA50611.1; JOINED. 

DR EMBL; M62462; AAA50611.1; JOINED. 

DR EMBL; M62463; AAA50611.1; JOINED. 

DR EMBL; M62464; AAA50611.1; JOINED. 

DR EMBL; M62465; AAA50611.1; JOINED. 

DR EMBL; M62466; AAA50611.1; JOINED. 

DR EMBL; M62467; AAA50611.1; JOINED. 

DR EMBL; M62468; AAA50611.1; JOINED.

DR EMBL; M62469; AAA50611.1; JOINED. 

DR EMBL; M87276; AAA53063.1; -. 

DR EMBL; J05606; AAA40431.1; -. 

DR EMBL; J05605; AAA40431.1; JOINED. 

DR PIR; A40558; A40558. 

DR MGD; MGI:98737; Thbs1.

DR GO; GO:0005615; C:extracellular space; IDA.

DR GO; GO:0016525; P:negative regulation of angiogenesis; IDA.

DR InterPro; IPR001881; EGF_Ca. 

DR InterPro; IPR006209; EGF_like. 

DR InterPro; IPR006210; IEGF. 

DR InterPro; IPR000884; TSP1. 

DR InterPro; IPR008085; TSP_1. 

DR InterPro; IPR003367; tsp_3. 

DR InterPro; IPR008859; TSPC. 

DR InterPro; IPR003129; TSPN. 

DR InterPro; IPR001007; VWF_C. 

DR Pfam; PF00008; EGF; 2. 

DR Pfam; PF00090; tsp_1; 3.

DR Pfam; PF02412; tsp_3; 13. 

DR Pfam; PF05735; TSPC; 1. 

DR Pfam; PF02210; TSPN; 1. 

DR Pfam; PF00093; vwc; 1. 

DR PRINTS; PR01705; TSP1REPEAT. 

DR SMART; SM00181; EGF; 3.

DR SMART; SM00209; TSP1; 3. 

DR SMART; SM00210; TSPN; 1. 

DR SMART; SM00214; VWC; 1. 

DR PROSITE; PS00022; EGF_1; FALSE_NEG. 

DR PROSITE; PS01186; EGF_2; 1.

DR PROSITE; PS50026; EGF_3; 2. 

DR PROSITE; PS50092; TSP1; 3. 

DR PROSITE; PS01208; VWFC_1; 1. 

DR PROSITE; PS50184; VWFC_2; 1. 

KW Glycoprotein; Cell adhesion; Calcium-binding; Heparin-binding; Repeat; 

KW EGF-like domain; Signal. 

FT   SIGNAL       1     18       POTENTIAL.
FT   CHAIN       19   1170       THROMBOSPONDIN 1.
FT   DOMAIN      19    232       HEPARIN-BINDING (POTENTIAL).
FT   DOMAIN      24    221       TSP N-TERMINAL.
FT   DOMAIN     316    373       VWFC.
FT   DOMAIN     379    429       TSP TYPE-1 1.
FT   DOMAIN     435    490       TSP TYPE-1 2.
FT   DOMAIN     492    547       TSP TYPE-1 3.
FT   DOMAIN     549    587       EGF-LIKE 1.
FT   DOMAIN     588    645       EGF-LIKE 2, CALCIUM-BINDING (POTENTIAL).
FT   DOMAIN     646    690       EGF-LIKE 3.
FT   DOMAIN     723    758       TSP TYPE-3 1.
FT   DOMAIN     759    781       TSP TYPE-3 2.
FT   DOMAIN     782    817       TSP TYPE-3 3.
FT   DOMAIN     818    840       TSP TYPE-3 4.
FT   DOMAIN     841    878       TSP TYPE-3 5.
FT   DOMAIN     879    914       TSP TYPE-3 6.
FT   DOMAIN     915    950       TSP TYPE-3 7.
FT   DOMAIN     951   1170       C-TERMINAL.
FT   SITE       926    928       CELL ATTACHMENT SITE (POTENTIAL).
FT   DISULFID   270    270       INTERCHAIN (PROBABLE).
FT   DISULFID   274    274       INTERCHAIN (PROBABLE).
FT   DISULFID   391    423       BY SIMILARITY.
FT   DISULFID   395    428       BY SIMILARITY.
FT   DISULFID   406    413       BY SIMILARITY.
FT   DISULFID   447    484       BY SIMILARITY.
FT   DISULFID   451    489       BY SIMILARITY.
FT   DISULFID   462    474       BY SIMILARITY.
FT   DISULFID   504    541       BY SIMILARITY.
FT   DISULFID   508    546       BY SIMILARITY.
FT   DISULFID   519    531       BY SIMILARITY.
FT   DISULFID   551    562       BY SIMILARITY.
FT   DISULFID   556    572       BY SIMILARITY.
FT   DISULFID   575    586       BY SIMILARITY.
FT   DISULFID   592    608       BY SIMILARITY.
FT   DISULFID   599    617       BY SIMILARITY.
FT   DISULFID   620    644       BY SIMILARITY.
FT   DISULFID   650    663       BY SIMILARITY.
FT   DISULFID   657    676       BY SIMILARITY.
FT   DISULFID   678    689       BY SIMILARITY.
FT   DISULFID   705    713       BY SIMILARITY.
FT   DISULFID   718    738       BY SIMILARITY.
FT   DISULFID   754    774       BY SIMILARITY.
FT   DISULFID   777    797       BY SIMILARITY.
FT   DISULFID   813    833       BY SIMILARITY.
FT   DISULFID   836    856       BY SIMILARITY.
FT   DISULFID   874    894       BY SIMILARITY.
FT   DISULFID   910    930       BY SIMILARITY.
FT   DISULFID   946   1167       BY SIMILARITY.
FT   CARBOHYD   248    248       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   360    360       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   708    708       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD  1067   1067       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CONFLICT  1025   1025       F -> L (IN REF. 2).
SQ   SEQUENCE   1170 AA;  129646 MW;  0443E493615E7F06 CRC64;



Query Match 5.6%; Score 270.5; DB 1; Length 1170; 

Best Local Similarity 32.2%; Pred. No. 2.4e-12; 



Matches 57; Conservative 24; Mismatches 71; Indels 25; Gaps 5;



Qy 207 VVRQARLADTAN YTCVAKN I VAR RRSASAAVIVYVNGGWSTWTEWSVCSASC 258 

: : : I I : I I : : I : I : I I I I I : I I I I : I 

Db 399 IQQRGRSCDSLNNRCEGSSVQTRTCHIQECDKRFKQ DGGWSHWSPWSSCSVTC 451 

Qy 259 GRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTAC-ATLCPVDGSWSPWSKWSACGLDC 314 

II | | | : | : | | III: : I I I I :: I I I I I I I : I 

Db 452 GDGVITRIRLCNSPSPQMNGKPCEGEARETKACKKDACPINGGWGPWSPWDICSVTCGGG 511 

Qy 315 THWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLCVHSASGPEDVALYVGLIAVAVC 371 

I I I I : : I I : I I : : I I : : I I II I III 

Db 512 VQRRSRLCNNPTPQFGGKDCVGDVTENQVCNKQDC PIDGCLSNPCFAGAKC 562 



RESULT 13 
TSP1_HUMAN 

ID TSP1_HUMAN STANDARD; PRT; 1170 AA. 

AC P07996; Q15667; 

DT 01-AUG-1988 (Rel. 08, Created) 

DT 01-AUG-1988 (Rel. 08, Last sequence update) 

DT 10-OCT-2003 (Rel. 42, Last annotation update) 

DE Thrombospondin 1 precursor. 

GN THBS1 OR TSP1 OR TSP. 

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC TISSUE=Endothelial cells; 

RX MEDLINE=87057617; PubMed=2430973;

RA Lawler J., Hynes R.O.; 

RT "The structure of human thrombospondin, an adhesive glycoprotein with 

RT multiple calcium-binding sites and homologies with several different 

RT proteins."; 

RL J. Cell Biol. 103:1635-1648(1986). 

RN [2] 

RP SEQUENCE FROM N.A. 

RX MEDLINE=89139590; PubMed=2918029;

RA Hennessy S.W., Frazier B.A., Kim D.D., Deckwerth T.L.,

RA Baumgartel D.M., Rotwein P., Frazier W.A.;

RT "Complete thrombospondin mRNA sequence includes potential regulatory 

RT sites in the 3' untranslated region."; 

RL J. Cell Biol. 108:729-736(1989). 

RN [3] 

RP SEQUENCE OF 1-397 FROM N.A. 

RX MEDLINE=87157592; PubMed=3030396; 

RA Kobayashi S., Eden-Mccutchan F., Framson P., Bornstein P.; 

RT "Partial amino acid sequence of human thrombospondin as determined by 

RT analysis of cDNA clones: homology to malarial circumsporozoite 

RT proteins."; 

RL Biochemistry 25:8418-8425(1986). 

RN [4] 

RP SEQUENCE OF 1-374 FROM N.A. 

RX MEDLINE=86287276; PubMed=3461443;

RA Dixit V.M., Hennessy S.W., Grant G.A., Rotwein P., Frazier W.A.;

RT "Characterization of a cDNA encoding the heparin and collagen binding

RT domains of human thrombospondin.";

RL Proc. Natl. Acad. Sci. U.S.A. 83:5449-5453(1986). 

RN [5] 

RP SEQUENCE OF 1-166 FROM N.A. 

RX MEDLINE=89291870; PubMed=2544587;

RA Laherty C.D., Gierman T.M., Dixit V.M.;

RT "Characterization of the promoter region of the human thrombospondin 

RT gene. DNA sequences within the first intron increase transcription."; 

RL J. Biol. Chem. 264:11222-11227(1989). 

RN [6] 

RP SEQUENCE OF 1028-1170 FROM N.A. 

RA la Fleur M., Jobin C., Gauthier J., Kreis C.G.;

RL Submitted (XXX-1992) to the EMBL/GenBank/DDBJ databases.

RN [7] 

RP CARBOHYDRATE-LINKAGE SITES TRP-385; SER-394; TRP-438; TRP-441; 

RP THR-450; TRP-498 AND THR-507. 

RC TISSUE=Platelet; 

RX MEDLINE=21125860; PubMed=11067851;

RA Hofsteenge J., Huwiler K.G., Macek B., Hess D., Lawler J.,

RA Mosher D.F., Peter-Katalinic J.;

RT "C-mannosylation and O-fucosylation of the thrombospondin type 1

RT module.";

RL J. Biol. Chem. 276:6485-6498(2001). 

RN [8] 

RP THROMBOSPONDIN DOMAIN DISULFIDE BRIDGES. 

RX MEDLINE=22338361; PubMed=12450399;

RA Huwiler K.G., Vestling M.M., Annis D.S., Mosher D.F.;

RT "Biophysical characterization, including disulfide bond assignments,

RT of the anti-angiogenic type 1 domains of human thrombospondin-1.";

RL Biochemistry 41:14329-14339(2002). 

CC -!- FUNCTION: Adhesive glycoprotein that mediates cell-to-cell and 

CC cell-to-matrix interactions. Can bind to fibrinogen, fibronectin, 

CC laminin, type V collagen and integrins alpha-V/beta-1, alpha- 

CC V/beta-3 and alpha-IIb/beta-3.

CC -!- SUBUNIT: Homotrimer; disulfide-linked.

CC -!- SIMILARITY: Belongs to the thrombospondin family.

CC -!- SIMILARITY: Contains 1 VWFC domain.

CC -!- SIMILARITY: Contains 3 EGF-like domains.

CC -!- SIMILARITY: Contains 3 TSP type-1 domains. 

CC -!- SIMILARITY: Contains 7 TSP type-3 domains. 

CC -!- SIMILARITY: Contains 1 TSP N-terminal (TSPN) domain. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC

DR EMBL; M25631; AAA36741.1; -.

DR EMBL; X04665; CAA28370.1; -.

DR EMBL; X14787; CAA32889.1; -.

DR EMBL; M14326; AAA61237.1; ALT_SEQ.

DR EMBL; J04835; AAA61178.1; -. 

DR EMBL; M99425; AAB59366.1; 



DR PIR; A26155; TSHUP1.

DR PDB; 1LSL; 18-DEC-02.

DR GlycoSuiteDB; P07996;

DR Genew; HGNC:11785; THBS1.

DR MIM; 188060;

DR GO; GO:0004866; F:endopeptidase inhibitor activity; TAS.

DR GO; GO:0004871; F:signal transducer activity; TAS.

DR GO; GO:0007275; P:development; TAS.

DR InterPro; IPR001881; EGF_Ca.

DR InterPro; IPR006209; EGF_like.

DR InterPro; IPR006210; IEGF.

DR InterPro; IPR000884; TSP1.

DR InterPro; IPR008085; TSP_1.

DR InterPro; IPR003367; tsp_3.

DR InterPro; IPR008859; TSPC.

DR InterPro; IPR003129; TSPN.

DR InterPro; IPR001007; VWF_C.

DR Pfam; PF00008; EGF; 2.

DR Pfam; PF00090; tsp_1; 3.

DR Pfam; PF02412; tsp_3; 13.

DR Pfam; PF05735; TSPC; 1.

DR Pfam; PF02210; TSPN; 1.

DR Pfam; PF00093; vwc; 1.

DR PRINTS; PR01705; TSP1REPEAT.

DR SMART; SM00181; EGF; 3.

DR SMART; SM00209; TSP1; 3.
Query Match 5.6%; Score 268.5; DB 1; Length 1170;

Best Local Similarity 32.9%; Pred. No. 3.4e-12;

Matches 51; Conservative 24; Mismatches 61; Indels 19; Gaps 4;
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Db 
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2 07 WRQARLADTANYT CVAKN I VAR RRSASAAVI VTWGGWSTWTEWSVCSASC 2 58 

: :: I I : I I : : I : I : I I I I I : I I I I : I 

399 IQQRGRSCDSLNNRCEGSSVQTRTCHIQECDKRFKQ DGGWSHWSPWSSCSVTC 451 

259 GRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTAC-ATLCPVDGSWSPWSKWSACGLDC 314 

II I I I : I : I I III: : II I I :: I I I I I I I : I 



Db 452 GDGVITRIRLCNSPSPQMNGKPCEGEARETKACKKDACPINGGWGPWSPWDICSVTCGGG 511 



Qy 315 THWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLC 349

I I I I : : I I : I I : : I I : : I I 
Db 512 VQKRSRLCNNPTPQFGGKDCVGDVTENQICNKQDC 546 

RESULT 14 
TSP1_BOVIN

ID TSP1_BOVIN STANDARD; PRT; 1170 AA.

AC Q28178; Q28179; 

DT 01-NOV-1997 (Rel. 35, Created) 

DT 16-OCT-2001 (Rel. 40, Last sequence update) 

DT 15-MAR-2004 (Rel. 43, Last annotation update) 

DE Thrombospondin 1 precursor. 

GN THBS1 OR TSP1 OR TSP-1. 

OS Bos taurus (Bovine).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Cetartiodactyla; Ruminantia; Pecora; Bovoidea; 

OC Bovidae; Bovinae; Bos. 

OX NCBI_TaxID=9913; 

RN [1] 

RP SEQUENCE FROM N.A. 

RC STRAIN=Holstein; TISSUE=Tooth;

RX MEDLINE=98173773; PubMed=9507054;

RA Ueno A., Yamashita K., Nagata T., Tsurumi C., Miwa Y., Kitamura S.,

RA Inoue H.;

RT "cDNA cloning of bovine thrombospondin 1 and its expression in

RT odontoblasts and predentin.";

RL Biochim. Biophys. Acta 1382:17-22(1998).

RN [2] 

RP SEQUENCE OF 1-18 AND 710-1170 FROM N.A.

RC TISSUE=Aortic endothelium; 

RA Zafar R.S., Moll Y.D., Womack J.F., Walz D.A.; 

RL Submitted (JUN-1995) to the EMBL/GenBank/DDBJ databases.

CC -!- FUNCTION: Adhesive glycoprotein that mediates cell-to-cell and 

CC cell-to-matrix interactions. Can bind to fibrinogen, fibronectin, 

CC laminin, type V collagen and integrins alpha-V/beta-1, alpha- 

CC V/beta-3 and alpha-IIb/beta-3. May play a role in dentinogenesis

CC and/or maintenance of dentin and dental pulp.

CC -!- SUBUNIT: Homotrimer; disulfide-linked.

CC -!- TISSUE SPECIFICITY: Odontoblasts. 

CC -!- SIMILARITY: Belongs to the thrombospondin family. 

CC -!- SIMILARITY: Contains 1 VWFC domain. 

CC -!- SIMILARITY: Contains 3 EGF-like domains. 

CC -!- SIMILARITY: Contains 3 TSP type-1 domains. 

CC -!- SIMILARITY: Contains 7 TSP type-3 domains. 

CC -!- SIMILARITY: Contains 1 TSP N-terminal (TSPN) domain. 

CC 

CC This SWISS-PROT entry is copyright. It is produced through a collaboration 

CC between the Swiss Institute of Bioinformatics and the EMBL outstation -

CC the European Bioinformatics Institute. There are no restrictions on its

CC use by non-profit institutions as long as its content is in no way 

CC modified and this statement is not removed. Usage by and for commercial 

CC entities requires a license agreement (See http://www.isb-sib.ch/announce/ 

CC or send an email to license@isb-sib.ch).

CC 
Query Match 5.5%; Score 265.5; DB 1; Length 1170;

Best Local Similarity 32.9%; Pred. No. 5.8e-12;

Matches 51; Conservative 24; Mismatches 61; Indels 19;



Gaps 



Qy

: : : I I : I I : : I : I : I I I I I : I I I I : I

Db 399 IQQRGRSCDSLNNRCEGSSVQTRTCHIQECDKRFKQ DGGWSHWSPWSSCSVTC 451

Qy 259 GRGWQKRSRSCTNPAPLNGGAFCEGQNVQKTAC-ATLCPVDGSWSPWSKWSACGLDC 314

II I I I : I : I I III: s II I I :: I I I I I I I : I

Db 452 GDGVITRIRLCNSPSPQMNGKPCEGKARETKACQKDSCPINGGWGPWSPWDICSVTCGGG 511

Qy 315 THWRSRECSDPAPRNGGEECQGTDLDTRNCTSDLC 349

I I I I : : I I : I I : : I I : : I I

Db 512 VQKRSRLCNNPKPQFGGKDCVGDVTENQICNKQDC 546



RESULT 15 
TSP2_CHICK 

ID TSP2_CHICK STANDARD; PRT; 1178 AA. 

AC P35440; 

DT 01-JUN-1994 (Rel. 29, Created) 

DT 01-JUN-1994 (Rel. 29, Last sequence update) 

DT 28-FEB-2003 (Rel. 41, Last annotation update) 



DE Thrombospondin 2 precursor.
GN THBS2 OR TSP2.
OS Gallus gallus (Chicken).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Archosauria; Aves; Neognathae; Galliformes; Phasianidae; Phasianinae;
OC Gallus.
OX NCBI_TaxID=9031;
RN [1]
RP SEQUENCE FROM N.A.
RX MEDLINE=91217026; PubMed=2022631;
RA Lawler J., Duquette M., Ferro P.;
RT "Cloning and sequencing of chicken thrombospondin.";
RL J. Biol. Chem. 266:8039-8043(1991).
CC -!- FUNCTION: Adhesive glycoprotein that mediates cell-to-cell and
CC cell-to-matrix interactions. Can bind to fibrinogen, fibronectin,
CC laminin and type V collagen.
CC -!- SUBUNIT: Homotrimer; disulfide-linked.
CC -!- SIMILARITY: Belongs to the thrombospondin family.
CC -!- SIMILARITY: Contains 1 VWFC domain.
CC -!- SIMILARITY: Contains 3 EGF-like domains.
CC -!- SIMILARITY: Contains 3 TSP type-1 domains.
CC -!- SIMILARITY: Contains 7 TSP type-3 domains.
CC -!- SIMILARITY: Contains 1 TSP N-terminal (TSPN) domain.
CC
CC This SWISS-PROT entry is copyright. It is produced through a collaboration
CC between the Swiss Institute of Bioinformatics and the EMBL outstation -
CC the European Bioinformatics Institute. There are no restrictions on its
CC use by non-profit institutions as long as its content is in no way
CC modified and this statement is not removed. Usage by and for commercial
CC entities requires a license agreement (See http://www.isb-sib.ch/announce/
CC or send an email to license@isb-sib.ch).
CC
DR EMBL; M60853; AAA51437.1; -.
DR PIR; A39804; A39804.
DR HSSP; P00740; 1EDM.
DR InterPro; IPR006209; EGF_like.
DR InterPro; IPR003367; tsp_3.
DR InterPro; IPR008859; TSPC.
DR InterPro; IPR003129; TSPN.
DR Pfam; PF00008; EGF; 1.
DR Pfam; PF00090; tsp_1; 3.
DR Pfam; PF02412; tsp_3; 13.
DR Pfam; PF05735; TSPC; 1.
DR Pfam; PF02210; TSPN; 1.
DR Pfam; PF00093; vwc; 1.
DR PRINTS; PR01705; TSP1REPEAT.
DR SMART; SM00181; EGF; 2.
DR SMART; SM00209; TSP1; 3.
FT   CARBOHYD   463    463       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   590    590       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD   716    716       N-LINKED (GLCNAC...) (POTENTIAL).
FT   CARBOHYD  1075   1075       N-LINKED (GLCNAC...) (POTENTIAL).
SQ   SEQUENCE   1178 AA;  131816 MW;  F37E02F42C8717A2 CRC64;



Query Match 5.5%; Score 263; DB 1; Length 1178; 

Best Local Similarity 36.2%; Pred. No. 9.1e-12; 

Matches 58; Conservative 16; Mismatches 70; Indels 16; 



Gaps 



Qy 210 QARLADTANYTCVAKNIVARRRS-ASAAVIVYVNGGWSTWTEWSVCSASCGRGWQKRSRS 268

•I I I : I I I : : I II I I : I I I I : I I I I I

Db 410 RGRSCDVTRSACTGPHIQTRMCSFKKCDHRIRQDGGWSHWSPWSSCSVTCGVGNITRIRL 469

Qy 269 CTNPAPLNGGAFCEGQNVQKTACATL-CPVDGSWSPWSKWSACGLDC---THWRSRECSD 324

l-ll M II : I I I I : I I I I I I I I I : I I I I | :

Db 470 CNSPIPQMGGKNCVGNGRETEKCEKAPCPVNGQWGPWSPWSACTVTCGGGIRERSRLCNS 529

Qy 325 PAPRNGGEECQGTDLDT RNCTSDLCVHSASGP 356

I I : I I : I I II I : I I I : : I

Db 530 PEPQYGGKPCVG DTKQHDMCNKRDCPIDGCLSNPCFP 566



Search completed: July 12, 2004, 22:57:51 
Job time : 24 secs



